MORPH 1.0
MORPH: Shape-agnostic PDE Foundation Models. Paper, GitHub Repo
Highlights:
- We introduce MORPH, a PDE foundation model designed to accommodate heterogeneous data across diverse physical phenomena.
- MORPH is shape-agnostic (1D/2D/3D, varying resolutions, fields with scalar/vector components), with physics-aware channel handling of PDE datasets.
- MORPH employs a larger transformer architecture with one cross-attention and four axial-attention modules that attend over a multi-fold increase in spatiotemporal patches (i.e., a larger context window).
- We pretrain and fine-tune on a broad, heterogeneous suite spanning three benchmarks, including multi-physics datasets such as magnetohydrodynamics (MHD), turbulent self-gravitating flows with cooling (TGC), high-resolution 2D compressible and incompressible Navier–Stokes, and large-scale 3D datasets.
- MORPH is an autoregressive, flexible, and powerful backbone for scalable, data-efficient scientific machine learning.
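To make "shape-agnostic" concrete: any field shaped (time, *spatial dims*, channels) can be flattened into one common sequence of patch tokens, so 1D, 2D, and 3D data flow through the same backbone. Below is a minimal NumPy sketch of such a patchifier; the function name `patchify` and the patching scheme are illustrative, not MORPH's released implementation.

```python
import numpy as np

def patchify(field, p):
    """Flatten a (T, *spatial, C) field into per-frame patch tokens.

    Shape-agnostic sketch: the same code handles 1D, 2D, and 3D spatial
    grids, as long as every spatial extent is divisible by patch size p.
    """
    t, *spatial, c = field.shape
    n = len(spatial)
    shape = [t]
    for s in spatial:
        assert s % p == 0, "spatial extent must be divisible by patch size"
        shape += [s // p, p]            # split each axis into (grid, patch)
    x = field.reshape(shape + [c])
    grid_axes = [1 + 2 * i for i in range(n)]   # coarse grid positions
    patch_axes = [2 + 2 * i for i in range(n)]  # within-patch offsets
    x = np.transpose(x, [0] + grid_axes + patch_axes + [1 + 2 * n])
    return x.reshape(t, -1, (p ** n) * c)       # (T, n_tokens, token_dim)

# 2D scalar+vector field with 3 channels: (4, 16, 16, 3) -> (4, 16, 48)
tokens_2d = patchify(np.zeros((4, 16, 16, 3)), 4)
# 1D field with 2 channels: (4, 32, 2) -> (4, 8, 8)
tokens_1d = patchify(np.zeros((4, 32, 2)), 4)
```

The token dimension grows as p^d · C with the spatial dimensionality d, which is why varying-resolution and varying-dimension datasets can share one transformer once tokens are projected to a common embedding width.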
**What's new:**
- **Architecture built for physics:** an autoregressive vision-transformer backbone with local convolutions, inter-field cross-attention, and efficient 4D axial attention for global space-time context.
- **One model, many shapes:** works across 1D/2D/3D, mixed scalar and vector fields, and varying resolutions without re-architecting, from simple time series to complex turbulent flows.
- **Strong results, lean tuning:** beats from-scratch baselines and matches or surpasses recent PDE foundation models; LoRA retains most of the gains with far fewer trainable parameters.
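The "lean tuning" point rests on LoRA's low-rank update: the pretrained weight W stays frozen and only two small factors B·A are trained. A minimal NumPy sketch of the idea (dimensions, `alpha`, and the rank `r` here are illustrative defaults, not MORPH's fine-tuning configuration):

```python
import numpy as np

d, r, alpha = 512, 8, 16
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))       # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                  # trainable up-projection, zero-init

def lora_forward(x):
    # effective weight is W + (alpha/r) * B @ A, computed without
    # materializing the full d x d update
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((2, d))
y = lora_forward(x)                   # with B = 0, identical to base model
```

Because B is zero-initialized, fine-tuning starts exactly at the pretrained model; the trainable parameter count is 2·r·d = 8,192 versus d² = 262,144 for full fine-tuning of this one matrix, which is why LoRA is attractive for the larger MORPH variants.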
Applications:
- General-purpose (task-agnostic) foundation model for PDEs.
- One model, multiple downstream tasks.
- Performs well in data- and compute-scarce scenarios.
Model Variants:
- MORPH-FM-Ti: ~7M with 4 levels of finetuning including LoRA.
- MORPH-FM-S: ~30M with 4 levels of finetuning including LoRA.
- MORPH-FM-M: ~120M with 4 levels of finetuning including LoRA.
- MORPH-FM-L: ~500M with 4 levels of finetuning including LoRA (LoRA recommended at this scale).
- MORPH-FM-XL: 1.2B with 4 levels of finetuning including LoRA (LoRA recommended at this scale).
- MORPH-SS-Ti: ~7M standalone models for 13 datasets.
- MORPH-SS-S: ~30M standalone models for 13 datasets.
Architecture:
- MORPH is built on a convolutional vision-transformer backbone that seamlessly handles heterogeneous spatiotemporal datasets of varying dimensionality (1D–3D), at different resolutions, with multiple fields of mixed scalar and vector components.
- The architecture combines:
- (i) component-wise convolution, which jointly processes scalar and vector channels to capture local interactions;
- (ii) inter-field cross-attention, which models and selectively propagates information between different physical fields;
- (iii) axial attention, which factorizes full spatiotemporal self-attention along individual spatial and temporal axes to reduce computational cost while retaining expressivity.
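The axial factorization in (iii) can be sketched in a few lines: instead of one attention over all T·X·Y·Z positions, attention runs along one axis at a time, treating all other axes as batch dimensions. The NumPy sketch below uses untrained, shared Q/K/V for brevity; it illustrates the factorization only and is not MORPH's implementation.

```python
import numpy as np

def softmax(s, axis=-1):
    s = s - s.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(s)
    return e / e.sum(axis=axis, keepdims=True)

def attn_along_axis(x, axis):
    """Self-attention over a single axis of x (..., C), other axes batched."""
    x = np.moveaxis(x, axis, -2)              # (..., L, C)
    q = k = v = x                             # sketch: no learned projections
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(x.shape[-1])
    out = softmax(scores) @ v
    return np.moveaxis(out, -2, axis)

def axial_attention(x):
    """Factorized 4D attention: one pass per space-time axis of (T, X, Y, Z, C)."""
    for ax in range(x.ndim - 1):              # every axis except channels
        x = attn_along_axis(x, ax)
    return x

x = np.random.default_rng(0).standard_normal((2, 6, 6, 6, 8))
y = axial_attention(x)                        # shape preserved: (2, 6, 6, 6, 8)
```

Per axis of length L, the score matrix is L x L instead of (T·X·Y·Z)², which is the computational saving the bullet refers to; stacking passes over all axes restores global space-time context.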
If you use MORPH in your research, please cite:
@misc{rautela2025morphshapeagnosticpdefoundation,
title={{MORPH}: Shape-agnostic {PDE} Foundation Models},
author={Mahindra Singh Rautela and Alexander Most and Siddharth Mansingh and Bradley C. Love and Ayan Biswas and Diane Oyen and Earl Lawrence},
year={2025},
eprint={2509.21670},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2509.21670}
}