EVA-RNA: Foundation Model for Transcriptomics
Transformer-based foundation model that produces sample-level and gene-level embeddings from RNA-seq profiles (bulk, microarray, pseudobulked single-cell) in human and mouse.
Installation
We recommend installing with the uv package manager:
uv venv --python 3.10
source .venv/bin/activate
uv pip install transformers torch==2.6.0 scanpy anndata tqdm scipy scikit-misc
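You can verify the environment with a quick check (the printed values will depend on your machine):
import torch
import transformers
import scanpy as sc
print("torch:", torch.__version__)                 # expected 2.6.0
print("transformers:", transformers.__version__)
print("scanpy:", sc.__version__)
print("CUDA available:", torch.cuda.is_available())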
Optional: Flash Attention
To handle larger gene contexts, EVA-RNA automatically uses Flash Attention when it is installed. Flash Attention requires an Ampere or newer GPU (A100 and beyond). We recommend using the following wheel.
uv pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
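To check whether the Flash Attention path can be used on your machine, you can inspect the installed package and your GPU's compute capability (Ampere corresponds to capability 8.x). This is a local diagnostic, not part of the EVA-RNA API:
import importlib.util
import torch

has_flash = importlib.util.find_spec("flash_attn") is not None
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print(f"flash_attn installed: {has_flash}, compute capability: {major}.{minor}")
    print("Flash Attention usable:", has_flash and major >= 8)
else:
    print("No CUDA GPU detected; Flash Attention will not be used.")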
Quick Start
import scanpy as sc
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("ScientaLab/eva-rna", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ScientaLab/eva-rna", trust_remote_code=True)
# Load example dataset (2,700 PBMCs, raw counts)
adata = sc.datasets.pbmc3k()
# Subset to 2,000 highly variable genes for efficiency
sc.pp.highly_variable_genes(adata, n_top_genes=2000, flavor="seurat_v3")
adata = adata[:, adata.var.highly_variable].copy()
# Encode (gene symbols auto-converted, preprocessing applied, GPU used if available)
embeddings = model.encode_anndata(tokenizer, adata)
adata.obsm["X_eva"] = embeddings
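The resulting embeddings behave like any other representation in scanpy; for example, a neighbor graph and UMAP can be built directly on them (standard scanpy calls, nothing EVA-RNA-specific):
sc.pp.neighbors(adata, use_rep="X_eva")
sc.tl.umap(adata)
sc.pl.umap(adata)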
Options
model.encode_anndata() accepts the following parameters:
- gene_column – column in adata.var with gene identifiers (default: uses adata.var_names)
- species – "human" or "mouse" for gene ID conversion (default: auto-detected)
- batch_size – samples per inference batch (default: 32)
- device – "cpu", "cuda", etc. (default: CUDA if available)
- show_progress – show a progress bar (default: True)
- preprocess – apply library-size normalization + log1p (default: True); set to False if the data is already log-transformed
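For example, a call overriding several of these defaults might look like the following (the gene_column value is illustrative and depends on your AnnData object):
embeddings = model.encode_anndata(
    tokenizer,
    adata,
    gene_column="gene_symbols",  # illustrative: column in adata.var holding gene identifiers
    species="human",             # skip auto-detection
    batch_size=64,
    device="cuda",
    preprocess=False,            # data is already normalized and log1p-transformed
)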
Advanced: Raw Tensor API
For users who need direct control over inputs (mixed precision is applied automatically):
import torch
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("ScientaLab/eva-rna", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ScientaLab/eva-rna", trust_remote_code=True)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()
# Gene IDs must be NCBI GeneIDs as strings
gene_ids = ["7157", "675", "672"] # TP53, BRCA2, BRCA1
expression_values = [5.5, 3.2, 4.1] # log1p-normalized
inputs = tokenizer(gene_ids, expression_values, padding=True, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}
with torch.inference_mode():
outputs = model(**inputs)
sample_embedding = outputs.cls_embedding # (1, 256)
gene_embeddings = outputs.gene_embeddings # (1, 3, 256)
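Continuing from this snippet, the per-gene embeddings can be compared directly, e.g. with a pairwise cosine-similarity matrix over the three genes (plain PyTorch, not an EVA-RNA API):
import torch.nn.functional as F

genes = gene_embeddings[0]  # (3, 256) embeddings for TP53, BRCA2, BRCA1
pairwise_sim = F.cosine_similarity(genes.unsqueeze(1), genes.unsqueeze(0), dim=-1)
print(pairwise_sim)         # (3, 3) similarity matrix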
Batch Processing
batch_gene_ids = [
["7157", "675", "672"],
["7157", "1956", "5290"],
]
batch_expression = [
[5.5, 3.2, 4.1],
[2.1, 6.3, 1.8],
]
inputs = tokenizer(batch_gene_ids, batch_expression, padding=True, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}
with torch.inference_mode():
outputs = model(**inputs)
sample_embeddings = outputs.cls_embedding # (2, 256)
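For larger datasets, the same pattern can be applied in chunks to keep GPU memory bounded (a sketch; the chunk size of 32 is arbitrary):
all_embeddings = []
chunk_size = 32
for start in range(0, len(batch_gene_ids), chunk_size):
    chunk = tokenizer(
        batch_gene_ids[start:start + chunk_size],
        batch_expression[start:start + chunk_size],
        padding=True,
        return_tensors="pt",
    )
    chunk = {k: v.to(device) for k, v in chunk.items()}
    with torch.inference_mode():
        out = model(**chunk)
    all_embeddings.append(out.cls_embedding.float().cpu())
sample_embeddings = torch.cat(all_embeddings, dim=0)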
Expression Decoder
EVA-RNA includes a pre-trained deterministic expression decoder that maps gene embeddings back to predicted expression values.
with torch.inference_mode():
# Encode
output = model.encode(**inputs)
# output.cls_embedding — sample-level embedding (batch, hidden_size)
# output.gene_embeddings — per-gene embeddings (batch, n_genes, hidden_size)
# Decode expression values
predicted_expression = model.decode(output.gene_embeddings)
# predicted_expression — (batch, n_genes)
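As a quick sanity check, the decoded values can be compared against the expression values that went in. This sketch reuses expression_values from the single-sample Raw Tensor API example above (illustrative only; three genes is far too few for a meaningful correlation):
import numpy as np

# assumes inputs here are the single-sample inputs built earlier
pred = predicted_expression.float().cpu().numpy().ravel()  # decoded values for the 3 genes
obs = np.array(expression_values)                          # [5.5, 3.2, 4.1]
print("Pearson r:", np.corrcoef(pred, obs)[0, 1])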
GPU and Precision
EVA-RNA automatically applies mixed precision for optimal performance:
- Ampere+ GPUs (A100, H100, RTX 30/40 series): bfloat16
- Older CUDA GPUs (V100, RTX 20 series): float16
- CPU: full precision (float32)
No manual torch.autocast() is needed.
Note — Flash Attention constraints: When flash attention is installed and an Ampere+ GPU is detected, the model uses flash attention layers. These layers require CUDA and half-precision inputs. If you move the model to CPU you will get a clear error asking you to move it back to GPU. If you pass
autocast=False, autocast is re-enabled automatically with a warning since flash attention cannot run in full precision.
Disabling Automatic Mixed Precision
For advanced use cases requiring manual precision control, pass autocast=False.
This only takes effect when flash attention is not active (i.e., on older GPUs or
when flash attention is not installed):
model = model.to("cuda").eval()
with torch.inference_mode():
# Disable automatic mixed precision (ignored when flash attention is active)
outputs = model(**inputs, autocast=False)
# Or via sample_embedding
embedding = model.sample_embedding(
gene_ids=gene_ids,
expression_values=values,
autocast=False,
)
Converting Gene Symbols to NCBI Gene IDs
The tokenizer vocabulary uses NCBI GeneIDs. A built-in gene mapper is included to convert gene symbols or Ensembl IDs:
tokenizer = AutoTokenizer.from_pretrained("ScientaLab/eva-rna", trust_remote_code=True)
# Available mappings:
# "symbol_to_ncbi" – human gene symbols → NCBI GeneIDs
# "ensembl_to_ncbi" – human Ensembl IDs → NCBI GeneIDs
# "symbol_to_ncbi_mouse" – mouse gene symbols → NCBI GeneIDs
mapper = tokenizer.gene_mapper["symbol_to_ncbi"]
gene_symbols = ["TP53", "BRCA2", "BRCA1"]
gene_ids = [mapper[s] for s in gene_symbols]
# gene_ids = ["7157", "675", "672"]
expression_values = [5.5, 3.2, 4.1]
inputs = tokenizer(gene_ids, expression_values, padding=True, return_tensors="pt")
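Not every symbol is guaranteed to be present in the mapping. A defensive variant that drops unmapped symbols could look like this (a sketch assuming the mapper supports dict-style membership tests, as the indexing above suggests):
gene_symbols = ["TP53", "BRCA2", "BRCA1", "NOT_A_GENE"]
mapped = [(s, mapper[s]) for s in gene_symbols if s in mapper]
kept_symbols = [s for s, _ in mapped]
gene_ids = [ncbi_id for _, ncbi_id in mapped]
print(f"mapped {len(gene_ids)}/{len(gene_symbols)} symbols")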
Citation
@article{eva-rna,
title={EVA: Towards a universal model of the immune system},
author={Scienta Team},
journal={arXiv},
year={2026},
}
License