Reading from Object Stores (e.g. S3)
If the path you pass points to an object store (currently S3 and GCS are supported), Exon will attempt to get your credentials from the environment and read the file.
For example, to read a file from S3:
import os
import biobear as bb
# You can also set these in your shell
os.environ["AWS_PROFILE"] = "my-profile"
os.environ["AWS_DEFAULT_REGION"] = "us-east-1"
# requires polars to be installed
session = bb.new_session()
df = session.read_fasta_file("s3://bucket/test.fa").to_polars()
S3 and S3 Compatible APIs
For S3, if you're on a personal computer, setting the `AWS_PROFILE` and `AWS_DEFAULT_REGION` environment variables should be sufficient.
For automated use, a recommendation is harder to give because it depends on the specific use case, but you can set some combination of the following environment variables to indicate the credentials to use:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
AWS_ENDPOINT
AWS_SESSION_TOKEN
AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
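For an automated job, one option is to set these variables inside the process before reading. The following is a minimal sketch, not a recommended setup: `configure_s3_env` and `read_fasta` are hypothetical helper names, the credential values are placeholders, and it assumes the variables are picked up when the file is read, as in the example earlier on this page.

```python
import os


def configure_s3_env(access_key_id, secret_access_key, region, session_token=None):
    """Export static S3 credentials into this process's environment (hypothetical helper)."""
    os.environ["AWS_ACCESS_KEY_ID"] = access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    os.environ["AWS_DEFAULT_REGION"] = region
    if session_token is not None:
        # Only needed when using temporary credentials
        os.environ["AWS_SESSION_TOKEN"] = session_token


def read_fasta(path):
    """Read a FASTA file into a polars DataFrame; requires biobear and polars."""
    import biobear as bb  # imported lazily so configure_s3_env works without it

    session = bb.new_session()
    return session.read_fasta_file(path).to_polars()
```

With the credentials configured, `read_fasta("s3://bucket/test.fa")` would then read the file as in the earlier example. In practice the values would usually come from a secrets manager or the job's environment rather than from source code.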
CloudFlare R2
The CloudFlare R2 service has an S3-compatible API, so it is supported by setting the relevant environment variables:
AWS_ACCESS_KEY_ID # your cloudflare access key id
AWS_SECRET_ACCESS_KEY # your cloudflare secret access key
AWS_DEFAULT_REGION # the region to use, likely `auto` should be sufficient
AWS_ENDPOINT # the endpoint to use, e.g. https://$CLOUDFLARE_ACCOUNT_ID.r2.cloudflarestorage.com
After that, pass a path like `s3://bucket/path/to/file` to biobear tools, where bucket is the bucket in CloudFlare and the path is the path in CloudFlare relative to the bucket.
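As a rough illustration, the R2 settings above could be applied from Python as follows. `r2_endpoint` and `configure_r2_env` are hypothetical helper names, and the account ID and keys in the usage line are placeholders.

```python
import os


def r2_endpoint(account_id):
    """Build the R2 endpoint URL for a CloudFlare account ID."""
    return f"https://{account_id}.r2.cloudflarestorage.com"


def configure_r2_env(account_id, access_key_id, secret_access_key):
    """Point the S3 environment variables at CloudFlare R2 (hypothetical helper)."""
    os.environ["AWS_ACCESS_KEY_ID"] = access_key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    os.environ["AWS_DEFAULT_REGION"] = "auto"  # `auto` is likely sufficient for R2
    os.environ["AWS_ENDPOINT"] = r2_endpoint(account_id)
```

After calling `configure_r2_env("my-account-id", "my-key-id", "my-secret")`, paths of the form `s3://bucket/path/to/file` resolve against R2 rather than AWS.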
LocalStack
LocalStack is a useful tool for local development and testing. It provides a local S3 compatible API, among other things. To use it, set the following environment variables:
AWS_ACCESS_KEY_ID # your localstack access key id
AWS_SECRET_ACCESS_KEY # your localstack secret access key
AWS_DEFAULT_REGION # the region to use
AWS_ENDPOINT_URL # the endpoint, e.g. if running on the default port http://localhost:4566
AWS_ALLOW_HTTP # allow http connections, useful for local development
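For a test suite, the LocalStack variables can be kept in one place and applied at start-up. This is a sketch under assumptions: `LOCALSTACK_ENV` and `use_localstack` are hypothetical names, the `test` credentials and the region are placeholder values, and `"true"` is assumed to be an accepted value for `AWS_ALLOW_HTTP`.

```python
import os

# Defaults for a LocalStack instance on the default port.
LOCALSTACK_ENV = {
    "AWS_ACCESS_KEY_ID": "test",  # assumption: placeholder credentials
    "AWS_SECRET_ACCESS_KEY": "test",  # assumption: placeholder credentials
    "AWS_DEFAULT_REGION": "us-east-1",  # assumption: any region you use locally
    "AWS_ENDPOINT_URL": "http://localhost:4566",
    "AWS_ALLOW_HTTP": "true",  # the local endpoint is plain http
}


def use_localstack(overrides=None):
    """Apply the LocalStack settings to os.environ and return what was set."""
    env = {**LOCALSTACK_ENV, **(overrides or {})}
    os.environ.update(env)
    return env
```

Calling `use_localstack()` before creating a session keeps the local settings out of your shell profile, and `overrides` lets a single test change one value, such as the port in the endpoint.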
GCS
For GCS, you can authenticate with a service account by setting either the path to the service account file or the JSON-serialized service account key:
GOOGLE_SERVICE_ACCOUNT: location of service account file
GOOGLE_SERVICE_ACCOUNT_KEY: JSON serialized service account key
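The two options can be set from Python along these lines. This is only a sketch: `configure_gcs_env` is a hypothetical helper, and the key contents in the usage note are placeholders rather than a real service account.

```python
import json
import os


def configure_gcs_env(key_file=None, key_info=None):
    """Set exactly one of the two GCS service account variables (hypothetical helper)."""
    if (key_file is None) == (key_info is None):
        raise ValueError("pass exactly one of key_file or key_info")
    if key_file is not None:
        # Path to the service account file on disk
        os.environ["GOOGLE_SERVICE_ACCOUNT"] = os.fspath(key_file)
    else:
        # The service account key itself, serialized as JSON
        os.environ["GOOGLE_SERVICE_ACCOUNT_KEY"] = json.dumps(key_info)
```

For example, `configure_gcs_env(key_file="/path/to/service-account.json")` uses the file, while `configure_gcs_env(key_info={"type": "service_account"})` passes the key inline, which can be convenient when the key is injected as a secret rather than mounted as a file.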