|
--- |
|
license: mit |
|
pipeline_tag: time-series-forecasting |
|
datasets: |
|
- williamgilpin/dysts |
|
--- |
|
|
|
# DynaMix-3D |
|
|
|
[arXiv:2505.13192](https://arxiv.org/abs/2505.13192)
|
|
|
DynaMix is a foundation model for zero-shot inference of dynamical systems that preserves long-term statistics. Unlike traditional approaches that require retraining for each new system, DynaMix generalizes across dynamical systems by learning universal representations that capture the underlying patterns governing temporal evolution. |
|
|
|
- **Accurate Zero-Shot DSR**: DynaMix generalizes across diverse dynamical systems without fine-tuning, accurately capturing attractor geometry and long-term statistics. |
|
- **Context-Flexible Dynamics Modeling**: The multivariate architecture captures dependencies across system dimensions and adapts flexibly to different dimensionalities (up to 3 for this model) and context lengths.
|
- **Efficient and Lightweight**: Designed to be efficient with ~10K parameters, DynaMix can run on CPU for inference, enabling orders-of-magnitude faster inference than traditional foundation models. |
|
- **Interpretable Dynamics**: Provides insights into the structure of reconstructed systems, revealing similarities across different dynamical systems. |
|
- **General Time Series Forecasting**: Extends beyond DSR to general time series forecasting using adaptable embedding techniques. |
|
|
|
For complete documentation and code, visit the [DynaMix repository](https://github.com/yourusername/zero-shot-DSR). |
|
|
|
## Model Description |
|
|
|
DynaMix is based on a sparse mixture of experts (MoE) architecture operating in latent space: |
|
|
|
1. **Expert Networks**: Each expert is a specialized dynamical model, implemented as an Almost-Linear Recurrent Neural Network (AL-RNN)
|
|
|
2. **Gating Network**: Selects experts based on the provided context and current latent representation of the dynamics |
|
|
|
The next state is predicted by aggregating the expert predictions, weighted by the gating network's output. The current model has the following configuration:
|
|
|
- **M (Latent state dimension):** 30 |
|
- **N (Observation space dimension):** 3 |
|
- **Experts:** 10 expert networks in the mixture |
|
- **Expert type:** `"almost_linear_rnn"` — a compact recurrent model combining linear and nonlinear components (`P=2` ReLU units) |
|
- **Probabilistic expert:** `False` (deterministic outputs; probabilistic Gaussian outputs optional) |
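The gated aggregation described above can be sketched in a few lines. This is an illustrative example only, not the DynaMix implementation: the parameter shapes, the linear gating rule, and the placement of the ReLU units are all assumptions made for the sketch, matching the configuration listed above (`M=30`, 10 experts, `P=2`).

```python
import numpy as np

# Illustrative sketch of a sparse-MoE latent update combining "almost-linear"
# expert predictions via a gating network. All parameter shapes and the
# gating rule are assumptions, not the actual DynaMix internals.
rng = np.random.default_rng(0)
M, K, P = 30, 10, 2                         # latent dim, experts, ReLU units

A = rng.normal(scale=0.1, size=(K, M, M))   # linear part of each expert
C = rng.normal(scale=0.1, size=(K, M, M))   # weights of the nonlinear part
b = rng.normal(scale=0.1, size=(K, M))      # expert biases
G = rng.normal(size=(K, M))                 # gating weights (assumed linear gate)

def expert_step(k, z):
    """Almost-linear expert k: linear map plus a ReLU correction on P units."""
    nonlin = np.zeros(M)
    nonlin[:P] = np.maximum(z[:P], 0.0)     # ReLU applied to P units only
    return A[k] @ z + b[k] + C[k] @ nonlin

def moe_step(z):
    """One latent update: softmax-gated combination of expert predictions."""
    logits = G @ z                          # (K,) gating scores
    w = np.exp(logits - logits.max())
    w /= w.sum()                            # softmax weights over experts
    preds = np.stack([expert_step(k, z) for k in range(K)])  # (K, M)
    return w @ preds                        # weighted aggregation -> next state

z_next = moe_step(rng.normal(size=M))
print(z_next.shape)  # (30,)
```

Iterating `moe_step` from an initial latent state rolls the dynamics forward autoregressively, which is what makes the per-step aggregation the core of the model.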
|
|
|
## Usage |
|
|
|
To load the model in Python using the codebase from the [DynaMix repository](https://github.com/yourusername/zero-shot-DSR), use:
|
|
|
```python |
|
from src.model.dynamix import DynaMix |
|
from huggingface_hub import hf_hub_download |
|
from safetensors.torch import load_file |
|
|
|
# Initialize model with architecture |
|
model = DynaMix(M=30, N=3, Experts=10, expert_type="almost_linear_rnn", P=2)
|
|
|
# Load model weights |
|
model_path = hf_hub_download( |
|
repo_id="DurstewitzLab/dynamix-3d", |
|
filename="dynamix-3d-base-v1.0.safetensors" |
|
) |
|
model_state_dict = load_file(model_path) |
|
model.load_state_dict(model_state_dict) |
|
|
|
# Set model to evaluation mode |
|
model.eval() |
|
``` |
|
|
|
Given context data from the target system with shape (`T_C`, `S`, `N`), where `T_C` is the context length, `S` the number of sequences to process, and `N` the data dimensionality, generate forecasts by passing the data through the `DynaMixForecaster` together with the loaded model. Further details can be found in the [DynaMix repository](https://github.com/yourusername/zero-shot-DSR).
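As a shape sanity check, the sketch below prepares a context tensor in the layout described above. The concrete values of `T_C` and `S` are arbitrary, and the `DynaMixForecaster` call is shown only as a hedged comment, since its exact signature is defined in the repository.

```python
import torch

# Sketch of the expected context layout; the values of T_C and S are arbitrary.
T_C, S, N = 512, 4, 3               # context length, sequences, dimensionality
context = torch.randn(T_C, S, N)    # context data from the target system

# The forecaster is then called with the loaded model and this context,
# roughly as follows (assumed usage -- consult the repository for the API):
#   forecaster = DynaMixForecaster(model)
#   forecast = forecaster(context, ...)   # rolls out future trajectories
print(tuple(context.shape))  # (512, 4, 3)
```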
|
|
|
|
|
## Citation |
|
|
|
If you use DynaMix in your research, please cite our paper: |
|
|
|
``` |
|
@misc{hemmer2025truezeroshotinferencedynamical, |
|
title={True Zero-Shot Inference of Dynamical Systems Preserving Long-Term Statistics}, |
|
author={Christoph Jürgen Hemmer and Daniel Durstewitz}, |
|
year={2025}, |
|
eprint={2505.13192}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.LG}, |
|
url={https://arxiv.org/abs/2505.13192}, |
|
} |
|
``` |