|
--- |
|
license: mit |
|
pipeline_tag: time-series-forecasting |
|
datasets: |
|
- williamgilpin/dysts |
|
--- |
|
|
|
# DynaMix-3D |
|
|
|
[arXiv:2505.13192](https://arxiv.org/abs/2505.13192)
|
|
|
DynaMix is a foundation model for zero-shot inference of dynamical systems that preserves long-term statistics. Unlike traditional approaches that require retraining for each new system, DynaMix generalizes across dynamical systems by learning universal representations that capture the underlying patterns governing temporal evolution. |
|
|
|
- **Accurate Zero-Shot DSR**: DynaMix generalizes across diverse dynamical systems without fine-tuning, accurately capturing attractor geometry and long-term statistics. |
|
- **Context-Flexible Dynamics Modeling**: The multivariate architecture captures dependencies across system dimensions and adapts flexibly to different dimensionalities (up to 3 for this model) and context lengths.
|
- **Efficient and Lightweight**: Designed to be efficient with ~10K parameters, DynaMix can run on CPU for inference, enabling orders-of-magnitude faster inference than traditional foundation models. |
|
- **Interpretable Dynamics**: Provides insights into the structure of reconstructed systems, revealing similarities across different dynamical systems. |
|
- **General Time Series Forecasting**: Extends beyond DSR to general time series forecasting using adaptable embedding techniques. |
|
|
|
For complete documentation and code, visit the [DynaMix repository](https://github.com/yourusername/zero-shot-DSR). |
|
|
|
## Model Description |
|
|
|
DynaMix is based on a sparse mixture of experts (MoE) architecture operating in latent space: |
|
|
|
1. **Expert Networks**: Each expert is a specialized dynamical model, implemented as an Almost-Linear Recurrent Neural Network (AL-RNN)
|
|
|
2. **Gating Network**: Selects experts based on the provided context and current latent representation of the dynamics |
|
|
|
The next state is predicted by aggregating the expert predictions, weighted by the gating network's output. The current model has the following configuration:
|
|
|
- **M (Latent state dimension):** 30 |
|
- **N (Observation space dimension):** 3 |
|
- **Experts:** 10 expert networks in the mixture |
|
- **Expert type:** `"almost_linear_rnn"` — a compact recurrent model combining linear and nonlinear components (`P=2` ReLU units) |
|
- **Probabilistic expert:** `False` (deterministic outputs; probabilistic Gaussian outputs optional) |
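The gated aggregation described above can be sketched in a few lines. This is an illustrative example only, not the DynaMix implementation: the parameter shapes, the linear gating rule, and the placement of the ReLU units are all assumptions made for the sketch, matching the configuration listed above (`M=30`, 10 experts, `P=2`).

```python
import numpy as np

# Illustrative sketch of a sparse-MoE latent update combining "almost-linear"
# expert predictions via a gating network. All parameter shapes and the
# gating rule are assumptions, not the actual DynaMix internals.
rng = np.random.default_rng(0)
M, K, P = 30, 10, 2                         # latent dim, experts, ReLU units

A = rng.normal(scale=0.1, size=(K, M, M))   # linear part of each expert
C = rng.normal(scale=0.1, size=(K, M, M))   # weights of the nonlinear part
b = rng.normal(scale=0.1, size=(K, M))      # expert biases
G = rng.normal(size=(K, M))                 # gating weights (assumed linear gate)

def expert_step(k, z):
    """Almost-linear expert k: linear map plus a ReLU correction on P units."""
    nonlin = np.zeros(M)
    nonlin[:P] = np.maximum(z[:P], 0.0)     # ReLU applied to P units only
    return A[k] @ z + b[k] + C[k] @ nonlin

def moe_step(z):
    """One latent update: softmax-gated combination of expert predictions."""
    logits = G @ z                          # (K,) gating scores
    w = np.exp(logits - logits.max())
    w /= w.sum()                            # softmax weights over experts
    preds = np.stack([expert_step(k, z) for k in range(K)])  # (K, M)
    return w @ preds                        # weighted aggregation -> next state

z_next = moe_step(rng.normal(size=M))
print(z_next.shape)  # (30,)
```

Iterating `moe_step` from an initial latent state rolls the dynamics forward autoregressively, which is what makes the per-step aggregation the core of the model.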
|
|
|
## Usage |
|
|
|
To load the model in Python using the codebase from the [DynaMix repository](https://github.com/yourusername/zero-shot-DSR), use:
|
|
|
```python |
|
from src.model.dynamix import DynaMix |
|
from huggingface_hub import hf_hub_download |
|
from safetensors.torch import load_file |
|
|
|
# Initialize model with architecture |
|
model = DynaMix(M=30, N=3, Experts=10, expert_type="almost_linear_rnn", P=2)
|
|
|
# Load model weights |
|
model_path = hf_hub_download( |
|
repo_id="DurstewitzLab/dynamix-3d", |
|
filename="dynamix-3d-base-v1.0.safetensors" |
|
) |
|
model_state_dict = load_file(model_path) |
|
model.load_state_dict(model_state_dict) |
|
|
|
# Set model to evaluation mode |
|
model.eval() |
|
``` |
|
|
|
Given context data from the target system with shape (`T_C`, `S`, `N`), where `T_C` is the context length, `S` the number of sequences to process, and `N` the data dimensionality, generate forecasts by passing the data through the `DynaMixForecaster` together with the loaded model. Further details can be found in the [DynaMix repository](https://github.com/yourusername/zero-shot-DSR).
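As a shape sanity check, the sketch below prepares a context tensor in the layout described above. The concrete values of `T_C` and `S` are arbitrary, and the `DynaMixForecaster` call is shown only as a hedged comment, since its exact signature is defined in the repository.

```python
import torch

# Sketch of the expected context layout; the values of T_C and S are arbitrary.
T_C, S, N = 512, 4, 3               # context length, sequences, dimensionality
context = torch.randn(T_C, S, N)    # context data from the target system

# The forecaster is then called with the loaded model and this context,
# roughly as follows (assumed usage -- consult the repository for the API):
#   forecaster = DynaMixForecaster(model)
#   forecast = forecaster(context, ...)   # rolls out future trajectories
print(tuple(context.shape))  # (512, 4, 3)
```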
|
|
|
|
|
## Citation |
|
|
|
If you use DynaMix in your research, please cite our paper: |
|
|
|
``` |
|
@misc{hemmer2025truezeroshotinferencedynamical, |
|
title={True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics}, |
|
author={Christoph Jürgen Hemmer and Daniel Durstewitz}, |
|
year={2025}, |
|
eprint={2505.13192}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.LG}, |
|
url={https://arxiv.org/abs/2505.13192}, |
|
} |
|
``` |