Magpie Alignment
AI & ML interests
Transparent LLM alignment for all.
Hi, I am Magpie 🐦, your efficient and high-quality synthetic data generation pipeline!
🕸️ Project Website: https://magpie-align.github.io/
📄 arXiv: https://arxiv.org/abs/2406.08464 (accepted at ICLR 2025)
🤗 HF Paper Page: https://huggingface.co/papers/2406.08464
😬 Code: https://github.com/magpie-align/magpie
🤗 Magpie Demo: https://huggingface.co/spaces/davanstrien/magpie (Thanks to @davanstrien for the implementation!)
🐦 MagpieLM: MagpieLM-4B, MagpieLM-8B
Questions? Please contact Zhangchen and/or Yuchen by email, or open an issue on GitHub.
🧭 Click here for full dataset navigation (SFT and DPO)
Recommended Filtered Datasets
Here are some filtered datasets created by the authors and used to train our Magpie-Align models. We also encourage you to create and apply your own filters to customize datasets; a minimal example is sketched after the table below.
We've kept these datasets in the 200K-300K range for convenience; we found this range to be a sweet spot between model performance and training time.
The full list of filtered datasets can be found here.
| Model Name | Dataset | Size | Type | Description |
|---|---|---|---|---|
| Llama 3.1 70B Instruct | Magpie-Llama-3.1-Pro-MT-300K-Filtered | 300K | SFT | (🌟 Flexible License! 🌟) 300K high-quality multi-turn conversations selected from Magpie-Llama-3.1-Pro-MT-500K. |
| Llama 3 70B Instruct | Magpie-Pro-300K-Filtered | 300K | SFT | 300K high-quality conversations filtered from Magpie-Pro-1M. |
| Llama 3 70B Instruct | Magpie-Pro-MT-300K | 300K | SFT | 300K difficult questions selected from Magpie-Pro-1M and extended to multi-turn conversations. |
| Llama 3 70B Instruct | Magpie-Reasoning-150K | 150K | SFT | Reasoning booster with 150K math, code, and reasoning conversations. We recommend mixing it with Magpie-Pro-MT-300K. |
| Qwen2 72B Instruct | Magpie-Qwen2-Pro-200K-Chinese | 200K | SFT | 200K high-quality Chinese conversations filtered from Magpie-Qwen2-Pro-1M. |
| Gemma-2-27b-it | Magpie-Gemma2-Pro-200K-Filtered | 200K | SFT | (🌟 Flexible License! 🌟) 200K conversations filtered from Magpie-Gemma2-Pro-534K. |
| Llama 3 8B Instruct | Magpie-Air-DPO-100K | 100K | DPO | DPO dataset built via Best-of-N sampling and reward scoring. |
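If you'd like to build your own filtered subset, here is a minimal sketch using the Hugging Face `datasets` library. The `train` split, the `instruction` column, and the length threshold are illustrative assumptions based on the typical Magpie schema; check the dataset viewer for each dataset's exact columns (some releases also ship quality and difficulty annotations that can serve as stronger filter criteria).

```python
# Minimal sketch of a custom filter over a Magpie dataset.
# Assumptions: the split is "train" and each example has an "instruction"
# string column; verify the exact schema in the dataset viewer.
from datasets import load_dataset

ds = load_dataset("Magpie-Align/Magpie-Pro-300K-Filtered", split="train")

# Hypothetical criterion: keep only conversations whose instruction is
# reasonably long, as a crude proxy for informativeness.
filtered = ds.filter(lambda ex: len(ex["instruction"]) > 200)

print(f"Kept {len(filtered)} of {len(ds)} conversations")
filtered.save_to_disk("magpie-pro-custom-filtered")
```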
Datasets
- Magpie-Align/Magpie-Reasoning-V1-150K
- Magpie-Align/Magpie-Reasoning-V1-150K-CoT-QwQ
- Magpie-Align/Magpie-Reasoning-V1-150K-CoT-Skywork-O1-Llama-3.1-8B
- Magpie-Align/Magpie-Reasoning-V1-150K-CoT-Deepseek-R1-Llama-70B
Models (27 total)
- Magpie-Align/MagpieLM-4B-Chat-v0.1
- Magpie-Align/MagpieLM-8B-Chat-v0.1
- Magpie-Align/MagpieLM-8B-SFT-v0.1
- Magpie-Align/MagpieLM-4B-SFT-v0.1
- Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.1
- Magpie-Align/Llama-3.1-8B-Magpie-Align-SFT-v0.1
- Magpie-Align/Llama-3-8B-Magpie-Pro-SFT-300K-v0.1
- Magpie-Align/Llama-3-8B-Self-Instruct-100K
- Magpie-Align/Llama-3.1-8B-Magpie-Align-v0.1