---
model_id: Toto-Open-Base-1.0
tags:
- time-series-forecasting
- foundation models
- pretrained models
- time series foundation models
- time series
- time-series
- transformers
- forecasting
- safetensors
- observability
paper:
- Link to Paper
datasets:
- Salesforce/GiftEvalPretrain
- autogluon/chronos_datasets
leaderboards:
- GiftEval
- BOOM
license: apache-2.0
pipeline_tag: time-series-forecasting
---
# Toto-Open-Base-1.0
Toto (Time Series Optimized Transformer for [Observability](https://www.datadoghq.com/knowledge-center/observability/)) is a time-series foundation model designed for multivariate time series forecasting, with a focus on observability metrics. Toto efficiently handles the high-dimensional, sparse, and non-stationary data commonly encountered in observability scenarios.
<div style="width: 100%; margin: auto; padding: 1rem;">
<img src="figures/architecture.png" alt="model architecture" style="width: 100%; height: auto;" />
<em style="display: block; margin-top: 0.5rem; text-align: center;">
Overview of Toto-Open-Base-1.0 architecture.
</em>
</div>
---
## ⚡ Quick Start: Model Inference
Inference code is available on [GitHub](https://github.com/DataDog/toto).
### Installation
```bash
# Clone the repository
git clone https://github.com/DataDog/toto.git
cd toto
# Install dependencies
pip install -r requirements.txt
```
### 🚀 Inference Example
Here's how to quickly generate forecasts using Toto:
```python
import torch

from data.util.dataset import MaskedTimeseries
from inference.forecaster import TotoForecaster
from model.toto import Toto

DEVICE = 'cuda'

# Load the pre-trained Toto model
toto = Toto.from_pretrained('Datadog/Toto-Open-Base-1.0').to(DEVICE)

# Optional: compile the model for faster inference
toto.compile()

forecaster = TotoForecaster(toto.model)

# Example input series (7 variables, 4096 timesteps)
input_series = torch.randn(7, 4096).to(DEVICE)
timestamp_seconds = torch.zeros(7, 4096).to(DEVICE)
time_interval_seconds = torch.full((7,), 60 * 15).to(DEVICE)

inputs = MaskedTimeseries(
    series=input_series,
    padding_mask=torch.full_like(input_series, True, dtype=torch.bool),
    id_mask=torch.zeros_like(input_series),
    timestamp_seconds=timestamp_seconds,
    time_interval_seconds=time_interval_seconds,
)

# Generate forecasts for the next 336 timesteps
forecast = forecaster.forecast(
    inputs,
    prediction_length=336,
    num_samples=256,
    samples_per_batch=256,
)

# Access the results
mean_prediction = forecast.mean
prediction_samples = forecast.samples
lower_quantile = forecast.quantile(0.1)
upper_quantile = forecast.quantile(0.9)
```
For detailed inference instructions, refer to the [inference tutorial notebook](https://github.com/DataDog/toto/blob/main/toto/notebooks/inference_tutorial.ipynb).
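The example above uses random data. As a rough sketch, here is how real metrics could be reshaped into the `(n_variates, n_timesteps)` layout Toto expects; the `metrics.csv` file, its column layout, and the 15-minute sampling interval are hypothetical assumptions, not part of the official example:

```python
import pandas as pd
import torch

DEVICE = 'cuda'

# Hypothetical CSV: one column per metric, one row per timestep,
# sampled at a regular 15-minute interval.
df = pd.read_csv("metrics.csv", index_col=0)

# Toto expects shape (n_variates, n_timesteps), so transpose the frame.
input_series = torch.tensor(df.values.T, dtype=torch.float32).to(DEVICE)

n_variates, n_timesteps = input_series.shape
timestamp_seconds = torch.zeros(n_variates, n_timesteps).to(DEVICE)
time_interval_seconds = torch.full((n_variates,), 60 * 15).to(DEVICE)
```

These tensors can then be wrapped in a `MaskedTimeseries` exactly as in the example above.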
### Performance Recommendations
- **For optimal speed and reduced memory usage, install [xFormers](https://github.com/facebookresearch/xformers) and [flash-attention](https://github.com/Dao-AILab/flash-attention), then set `use_memory_efficient` to `True`** (see the sketch below).
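A small sketch of guarding this setting on the availability of the optional packages; note that the exact call site for `use_memory_efficient` is not shown in this card, so the commented line is an assumption to be checked against the Toto repository:

```python
import importlib.util

# Verify that the optional acceleration packages are installed
# before enabling memory-efficient attention.
has_xformers = importlib.util.find_spec("xformers") is not None
has_flash_attn = importlib.util.find_spec("flash_attn") is not None

if has_xformers and has_flash_attn:
    # Assumption about where the flag lives -- consult the Toto repository
    # or the inference tutorial notebook for the exact call site, e.g.:
    # forecast = forecaster.forecast(inputs, ..., use_memory_efficient=True)
    pass
else:
    print("xFormers and/or flash-attention not found; "
          "leave use_memory_efficient at its default.")
```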
---
### 💾 Available Checkpoints
| Checkpoint | Parameters | Config | Size | Notes |
|------------|------------|--------|------|-------|
| [Toto-Open-Base-1.0](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/model.safetensors) | 151M | [Config](https://huggingface.co/Datadog/Toto-Open-Base-1.0/blob/main/config.json) | 605 MB | Initial release with SOTA performance |
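If you want the raw checkpoint files rather than loading through `Toto.from_pretrained`, they can be fetched with the standard `huggingface_hub` client (plain Hub usage, not Toto-specific code):

```python
from huggingface_hub import hf_hub_download

# Download the released checkpoint and its config from the Hub.
weights_path = hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="model.safetensors")
config_path = hf_hub_download(repo_id="Datadog/Toto-Open-Base-1.0", filename="config.json")

print(weights_path, config_path)
```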
## ✨ Key Features
- **Zero-Shot Forecasting**
- **Multi-Variate Support**
- **Decoder-Only Transformer Architecture**
- **Probabilistic Predictions (Student-T mixture model)**
- **Causal Patch-Wise Instance Normalization** (see the sketch after this list)
- **Extensive Pretraining on Large-Scale Data**
- **High-Dimensional Time Series Support**
- **Tailored for Observability Metrics**
- **State-of-the-Art Performance** on [GiftEval](https://huggingface.co/spaces/Salesforce/GIFT-Eval) and [BOOM](https://huggingface.co/datasets/Datadog/BOOM)
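To illustrate the causal patch-wise instance normalization listed above: each patch is standardized using statistics computed only from the current and earlier patches, so no future values leak into the normalization. The sketch below is a rough, assumption-laden illustration; the patch size and exact statistics are not the model's actual parameters:

```python
import torch

def causal_patch_instance_norm(x: torch.Tensor, patch_size: int = 64, eps: float = 1e-5) -> torch.Tensor:
    """Normalize each patch with mean/std computed over all patches up to and
    including the current one, per variate. Illustrative sketch only."""
    n_vars, n_steps = x.shape
    out = torch.empty_like(x)
    for start in range(0, n_steps, patch_size):
        end = min(start + patch_size, n_steps)
        history = x[:, :end]                      # causal: only past + current patch
        mean = history.mean(dim=1, keepdim=True)
        std = history.std(dim=1, keepdim=True)
        out[:, start:end] = (x[:, start:end] - mean) / (std + eps)
    return out
```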
---
## 📚 Training Data Summary
- **Observability Metrics:** ~1 trillion points from Datadog internal systems (no customer data)
- **Public Datasets:**
- [GiftEval Pretrain](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain)
- [Chronos datasets](https://huggingface.co/datasets/autogluon/chronos_datasets)
- **Synthetic Data:** ~1/3 of training data
---
## 🔗 Additional Resources
- **Research Paper (To add)**
- **[GitHub Repository](https://github.com/DataDog/toto.git)**
- **[Blog Post](#TODO-link-to-blogpost)**
- **[BOOM Dataset](https://huggingface.co/datasets/Datadog/BOOM)**
---
## 📖 Citation
If you use Toto in your research or applications, please cite us using the following:
```bibtex
@misc{toto2025,
      title={This Time is Different: An Observability Perspective on Time Series Foundation Models},
      author={TODO},
      year={2025},
      eprint={arXiv:TODO},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```