Introduction

EXAONE Path MSI is an enhanced whole-slide image (WSI) classification framework that retains the core architecture of EXAONE Path while upgrading its internals for greater efficiency and richer multimodal integration.

The pipeline still unfolds in two stages:

  1. Patch-wise feature extraction – Each WSI is tiled into 256 × 256 px patches, which are embedded into 768-dimensional vectors using the frozen EXAONE Path encoder.
  2. Slide-level aggregation – The patch embeddings are aggregated using a Vision Transformer, producing a unified slide-level representation that a lightweight classification head transforms into task-specific probabilities.
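The two stages above can be sketched as simple tensor bookkeeping. This is a minimal sketch with hypothetical helper names and slide dimensions; only the patch size (256 px) and embedding width (768) come from the description above.

```python
# Stage-1 bookkeeping for the two-stage pipeline (helper names hypothetical).
PATCH_SIZE = 256   # each WSI is tiled into 256 x 256 px patches
EMBED_DIM = 768    # the frozen EXAONE Path encoder emits 768-d vectors

def patch_grid(slide_w: int, slide_h: int) -> tuple[int, int]:
    """Number of non-overlapping 256 x 256 tiles along each axis."""
    return slide_w // PATCH_SIZE, slide_h // PATCH_SIZE

def slide_embedding_shape(slide_w: int, slide_h: int) -> tuple[int, int]:
    """Stage-1 output: one 768-d vector per patch -> (n_patches, 768)."""
    cols, rows = patch_grid(slide_w, slide_h)
    return cols * rows, EMBED_DIM

# A 10240 x 7680 px slide yields a 40 x 30 grid of patches:
n_patches, dim = slide_embedding_shape(10240, 7680)
print(n_patches, dim)  # 1200 768
```

Stage 2 then feeds this `(n_patches, 768)` sequence through the ViT aggregator to produce a single slide-level vector.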

Key Improvements

  • FlexAttention + torch.compile
    What changed: Replaced vanilla multi‑head self‑attention with IO‑aware FlexAttention kernels and enabled torch.compile to fuse the forward/backward graphs at runtime. The new kernel layout substantially improves memory efficiency and raises throughput in both training and inference.

  • Coordinate‑aware Relative Bias
    What changed: Added an ALiBi‑style distance bias that is computed from the (x, y) patch coordinates themselves, allowing the ViT aggregator to reason about spatial proximity.
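The idea can be illustrated in pure Python (function name and slope value are hypothetical): attention logits are penalised in proportion to the Euclidean distance between patch coordinates, so spatially nearby patches attend to each other more strongly.

```python
import math

def coord_bias(coords: list[tuple[int, int]], slope: float = 0.5) -> list[list[float]]:
    """ALiBi-style additive bias from (x, y) patch coordinates:
    bias[i][j] = -slope * euclidean_distance(coords[i], coords[j])."""
    n = len(coords)
    bias = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            bias[i][j] = -slope * math.hypot(xi - xj, yi - yj)
    return bias

# Three patches on a grid: the adjacent pair gets a milder penalty
# than the distant pair.
b = coord_bias([(0, 0), (0, 1), (3, 4)])
print(b[0][1], b[0][2])  # -0.5 -2.5
```

In a FlexAttention setting, a bias like this maps naturally onto a `score_mod` callback, so the full bias matrix never needs to be materialised.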

  • Scalable Mixed‑Omics Encoder (Token‑mixing Transformer)
    What changed: Each omics modality is first tokenised into a fixed‑length set. All modality‑specific tokens are concatenated into a single sequence and passed through a shared multi‑head self‑attention stack, enabling direct information exchange across modalities in one shot. The aggregated omics representation is subsequently fused with image tokens via cross‑attention. This release uses three modalities (RNA, CNV, DNA‑methylation), but the design is agnostic to modality count and scales linearly with token number.
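A shape-level sketch of the token-mixing step (all dimensions, token counts, and names are hypothetical; single-head attention for brevity). It shows only the shared self-attention over the concatenated modality tokens; in the actual model the mixed omics representation is then fused with image tokens via cross-attention.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    """Single-head scaled dot-product self-attention; tokens: (n, d)."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    return softmax(scores) @ tokens

rng = np.random.default_rng(0)
d = 16
rna    = rng.normal(size=(8, d))   # 8 RNA tokens (fixed-length set)
cnv    = rng.normal(size=(8, d))   # 8 CNV tokens
methyl = rng.normal(size=(8, d))   # 8 DNA-methylation tokens

# Concatenating into one sequence lets a single attention pass exchange
# information directly across all modalities.
omics = np.concatenate([rna, cnv, methyl], axis=0)   # (24, d)
mixed = self_attention(omics)
print(mixed.shape)  # (24, 16)
```

Because modalities enter as token sets of fixed length, adding a fourth modality only appends more tokens to the sequence, which is why the design scales with modality count.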


Quick Start

Requirements

  • NVIDIA GPU (≥ 40 GB VRAM)
  • CUDA 12.8
  • PyTorch 2.7.0 (+cu128)

Installation

git clone https://huggingface.co/LGAI-EXAONE/{MODEL_NAME}.git
cd {MODEL_NAME}
pip install -r requirements.txt

Quick Inference

from models.exaonepath import EXAONEPathV1p5Downstream
 
hf_token = "YOUR_HUGGING_FACE_ACCESS_TOKEN"
model = EXAONEPathV1p5Downstream.from_pretrained(
    "LGAI-EXAONE/{MODEL_NAME}",
    use_auth_token=hf_token
)
probs = model("./samples/wsis/1/1.svs")
print(f"P(CRC-MSI mutant) = {probs[1]:.3f}")

Command‑line

python inference.py --svs_path ./samples/wsis/1/1.svs

Model Performance Comparison

| Model | CRC-MSI (AUC) |
|---|---|
| Titan (CONCH v1.5 + iBOT, image–text) | 0.9370 |
| PRISM (Virchow + Perceiver, image–text) | 0.9432 |
| CHIEF (CTransPath + CLAM, image–text, WSI-contrastive) | 0.9273 |
| Prov-GigaPath (GigaPath + LongNet, image-only, mask-prediction) | 0.9541 |
| UNI2-h + CLAM (image-only) | 0.9808 |
| EXAONE Path 1.5 | 0.9537 |
| EXAONE Path MSI | 0.9844 |