---
model_id: Toto-Open-Base-1.0
tags:
- time-series-forecasting
- foundation models
- pretrained models
- time series foundation models
- time series
- time-series
- transformers
- forecasting
- safetensors
- observability
paper:
- Link to Paper
datasets:
- Salesforce/GiftEvalPretrain
- autogluon/chronos_datasets
leaderboards:
- GiftEval (if results are public) #TODO(Anna) check how to do that
- BOOM (if results are public) #TODO(Anna) check how to do that
license: apache-2.0
pipeline_tag: time-series-forecasting
---

# Toto-Open-Base-1.0

Toto (Time Series Optimized Transformer for [Observability](https://www.datadoghq.com/knowledge-center/observability/)) is a time-series foundation model designed for multi-variate time series forecasting, with an emphasis on observability metrics. Toto efficiently handles the high-dimensional, sparse, and non-stationary data commonly encountered in observability scenarios.

<div style="width: 100%; margin: auto; padding: 1rem;">
  <img src="figures/architecture.png" alt="model architecture" style="width: 100%; height: auto;" />
  <em style="display: block; margin-top: 0.5rem; text-align: center;">
    Overview of Toto-Open-Base-1.0 architecture.
  </em>
</div>

---

## ⚡ Quick Start: Model Inference

Inference code is available on [GitHub](https://github.com/DataDog/toto).

### Installation

```bash
# Clone the repository
git clone https://github.com/DataDog/toto.git
cd toto

# Install dependencies
pip install -r requirements.txt
```

### 🚀 Inference Example

Here's how to quickly generate forecasts using Toto:

```python
import torch

from data.util.dataset import MaskedTimeseries
from inference.forecaster import TotoForecaster
from model.toto import Toto

DEVICE = 'cuda'

# Load pre-trained Toto model
toto = Toto.from_pretrained('Datadog/Toto-Open-Base-1.0').to(DEVICE)

# Optional: compile model for enhanced speed
toto.compile()

forecaster = TotoForecaster(toto.model)

# Example input series (7 variables, 4096 timesteps)
input_series = torch.randn(7, 4096).to(DEVICE)

# Placeholder timestamps and a fixed 15-minute sampling interval per variable
timestamp_seconds = torch.zeros(7, 4096).to(DEVICE)
time_interval_seconds = torch.full((7,), 60 * 15).to(DEVICE)

inputs = MaskedTimeseries(
    series=input_series,
    padding_mask=torch.full_like(input_series, True, dtype=torch.bool),
    id_mask=torch.zeros_like(input_series),
    timestamp_seconds=timestamp_seconds,
    time_interval_seconds=time_interval_seconds,
)

# Generate forecasts for the next 336 timesteps
forecast = forecaster.forecast(
    inputs,
    prediction_length=336,
    num_samples=256,
    samples_per_batch=256,
)

# Access results
mean_prediction = forecast.mean
prediction_samples = forecast.samples
lower_quantile = forecast.quantile(0.1)
upper_quantile = forecast.quantile(0.9)
```

For detailed inference instructions, refer to the [inference tutorial notebook](https://github.com/DataDog/toto/blob/main/toto/notebooks/inference_tutorial.ipynb).
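
To sanity-check the outputs above, you can tabulate the point forecast alongside its quantile band. The sketch below assumes the `(variates, prediction_length)` tensor layout implied by the example; verify the actual shapes against the tutorial notebook. `pandas` is used only for display.

```python
import pandas as pd

# Continues from the example above. Squeeze away any leading batch dimension
# so the tensors are (variates, prediction_length); verify against the notebook.
mean = mean_prediction.squeeze().float().cpu()
p10 = lower_quantile.squeeze().float().cpu()
p90 = upper_quantile.squeeze().float().cpu()

variate = 0  # inspect the first of the 7 series
df = pd.DataFrame({
    "mean": mean[variate].numpy(),
    "p10": p10[variate].numpy(),
    "p90": p90[variate].numpy(),
})
df.index.name = "step"  # 0..335 over the 336-step horizon

# How often the point forecast falls inside the 10-90% band
coverage = ((df["p10"] <= df["mean"]) & (df["mean"] <= df["p90"])).mean()
print(f"mean inside the 10-90% band at {coverage:.0%} of steps")
print(df.head())
```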

### Performance Recommendations

- For optimal speed and reduced memory usage, install [xFormers](https://github.com/facebookresearch/xformers) and [flash-attention](https://github.com/Dao-AILab/flash-attention), then set `use_memory_efficient` to `True`. A sketch of the optional install follows.
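
A minimal install sketch for these optional dependencies. The package names are the usual PyPI ones (`xformers`, `flash-attn`); `flash-attn` compiles CUDA kernels, so it needs a CUDA toolchain compatible with your PyTorch build. Check each project's README if the build fails.

```bash
# Optional accelerated-attention backends
pip install xformers
# --no-build-isolation lets the build see the already-installed torch
pip install flash-attn --no-build-isolation
```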

---

### 💾 Available Checkpoints

| Checkpoint | Parameters | Config | Size | Notes |
|------------|------------|--------|------|-------|
| [Toto-Open-Base-1.0](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/model.safetensors) | 151M | [Config](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/config.json) | 605 MB | Initial release with SOTA performance |
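
If you need the raw checkpoint files rather than `Toto.from_pretrained`, they can be fetched directly with the standard `huggingface_hub` client, as in this sketch:

```python
from huggingface_hub import hf_hub_download

# Download the weights and config from the Hub; returns local cache paths.
weights_path = hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="model.safetensors")
config_path = hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="config.json")
print(weights_path, config_path)
```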

## ✨ Key Features

- **Zero-Shot Forecasting**
- **Multi-Variate Support**
- **Decoder-Only Transformer Architecture**
- **Probabilistic Predictions** via a Student-T mixture model (see the sketch after this list)
- **Causal Patch-Wise Instance Normalization**
- **Extensive Pretraining on Large-Scale Data**
- **High-Dimensional Time Series Support**
- **Tailored for Observability Metrics**
- **State-of-the-Art Performance** on [GiftEval](https://huggingface.co/spaces/Salesforce/GIFT-Eval) and [BOOM](https://huggingface.co/datasets/Datadog/BOOM)
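
The probabilistic head predicts a mixture of Student-T distributions, and outputs like `forecast.samples` above are Monte Carlo draws from it. The snippet below is a toy illustration of such a mixture using `torch.distributions`; it is not Toto's actual output head, and every parameter value is made up.

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, StudentT

# Toy Student-T mixture for a single forecast step: k components, each with
# its own degrees of freedom, location, and scale. In Toto these parameters
# would come from the model head; here they are random for illustration.
k = 4
logits = torch.randn(k)                                      # mixture weights (pre-softmax)
df = torch.nn.functional.softplus(torch.randn(k)) + 2.0      # df > 2 keeps variance finite
loc = torch.randn(k)                                         # component locations
scale = torch.nn.functional.softplus(torch.randn(k)) + 1e-3  # positive scales

mixture = MixtureSameFamily(Categorical(logits=logits), StudentT(df, loc, scale))

# Monte Carlo draws, analogous to forecast.samples; empirical quantiles of
# the draws give prediction intervals like forecast.quantile(0.1).
samples = mixture.sample((256,))
print(samples.quantile(torch.tensor([0.1, 0.5, 0.9])))
```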

---

## 📚 Training Data Summary

- **Observability Metrics:** ~1 trillion points from Datadog internal systems (no customer data)
- **Public Datasets** (a loading sketch follows this list):
  - [GiftEval Pretrain](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain)
  - [Chronos datasets](https://huggingface.co/datasets/autogluon/chronos_datasets)
- **Synthetic Data:** ~1/3 of the training data
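
Both public corpora are hosted on the Hugging Face Hub, so they can be pulled with the standard `datasets` library. In this sketch the subset name `m4_hourly` is only an illustrative assumption; list the available configs first and substitute one that exists.

```python
from datasets import get_dataset_config_names, load_dataset

# The Chronos collection is split into many named subsets; enumerate them.
configs = get_dataset_config_names("autogluon/chronos_datasets")
print(configs[:10])

# "m4_hourly" is an illustrative choice; substitute any config printed above.
ds = load_dataset("autogluon/chronos_datasets", "m4_hourly", split="train")
print(ds)
```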

---

## 🔗 Additional Resources

- **Research Paper (To add)**
- **[GitHub Repository](https://github.com/DataDog/toto.git)**
- **[Blog Post](#TODO-link-to-blogpost)**
- **[BOOM Dataset](https://huggingface.co/datasets/Datadog/BOOM)**

---

## 📖 Citation

If you use Toto in your research or applications, please cite us using the following:

```bibtex
@misc{toto2025,
  title={This Time is Different: An Observability Perspective on Time Series Foundation Models},
  author={TODO},
  year={2025},
  eprint={arXiv:TODO},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```