Welcome to Open-R1 🐳🤗
Open-R1 is an open initiative to replicate and extend the techniques behind DeepSeek-R1, a state-of-the-art reasoning model, in a fully transparent and collaborative way. The project repository is at https://github.com/huggingface/open-r1
This organization is dedicated to:
- Sharing datasets and models built on the path to replicating DeepSeek-R1.
- Fostering meaningful discussions and collaboration in the Community tab.
By working together, we aim to create a robust foundation for reasoning models that the entire research and industry community can leverage.
Plan of attack
We are using the DeepSeek-R1 technical report as a guide to recreate DeepSeek's pipeline. The work can be broken down into three main steps:
- Replicate R1-Distill: Distill a high-quality reasoning corpus from DeepSeek-R1 to create the R1-Distill models.
- Recreate the pure RL pipeline: Reproduce the reinforcement learning process that DeepSeek used to train R1-Zero. This will likely require curating new, large-scale datasets for math, reasoning, and code.
- Demonstrate end-to-end training: Show that we can go from a base model to RL-tuned reasoning capabilities through a multi-stage training approach, combining supervised fine-tuning (SFT) and reinforcement learning (RL).
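To make the pure RL step more concrete, here is a minimal sketch of the kind of rule-based reward that the DeepSeek-R1 report describes for R1-Zero (an accuracy reward plus a format reward), together with group-normalized advantages in the style of GRPO. It is an illustration under stated assumptions, not the Open-R1 implementation: the function names, the `\boxed{...}` answer convention, the exact-string answer check, the equal weighting of the two rewards, and the use of the population standard deviation are all hypothetical choices made for this example.

```python
import re
from statistics import mean, pstdev

# Assumed completion layout: reasoning inside <think>...</think>, followed by
# a final answer that contains \boxed{...}. Both conventions are assumptions.
THINK_PATTERN = re.compile(r"^<think>(.+?)</think>(.+)$", re.DOTALL)
BOXED_PATTERN = re.compile(r"\\boxed\{([^{}]*)\}")


def format_reward(completion: str) -> float:
    """Return 1.0 if the text is a <think> block followed by more text, else 0.0."""
    return 1.0 if THINK_PATTERN.match(completion.strip()) else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the last boxed answer equals the reference string, else 0.0."""
    answers = BOXED_PATTERN.findall(completion)
    if not answers:
        return 0.0
    return 1.0 if answers[-1].strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str) -> float:
    """Sum the two rule-based rewards with equal weight (an arbitrary choice)."""
    return accuracy_reward(completion, reference) + format_reward(completion)


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of samples drawn for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Three hypothetical completions sampled for the same prompt ("What is 2+2?").
    completions = [
        "<think>2+2 is 4</think>The answer is \\boxed{4}",
        "The answer is \\boxed{4}",
        "<think>guess</think>\\boxed{5}",
    ]
    rewards = [total_reward(c, "4") for c in completions]
    print(rewards)  # [2.0, 1.0, 1.0]
    print(group_advantages(rewards))
```

Only the first completion is both correct and well formatted, so it receives a positive advantage while the other two receive negative ones. A real pipeline would likely replace the exact-string comparison with a proper math or code verifier.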
How to contribute
This project thrives on community participation! Here are some ways you can contribute:
- Join the discussion: Share ideas, ask questions, and collaborate with others in the Community tab.
- Contribute code or datasets: Submit pull requests with datasets, models, or improvements to the pipeline.
- Experiment and share results: Try out different approaches and share your findings with the community.
Let’s build something impactful together. 🚀
Models (5)
- open-r1/OpenR1-Qwen-7B: Text Generation, 8B parameters, 57 downloads, 54 likes
- open-r1/OpenR1-Distill-7B: Text Generation, 8B parameters, 130 downloads, 22 likes
- open-r1/Qwen2.5-Math-7B-RoPE-300k: Text Generation, 8B parameters, 505 downloads, 5 likes
- open-r1/OlympicCoder-32B: Text Generation, 33B parameters, 59 downloads, 154 likes
- open-r1/OlympicCoder-7B: Text Generation, 8B parameters, 256 downloads, 181 likes
Datasets (22)
- open-r1/DAPO-Math-17k-Processed: 34.8k rows, 5.66k downloads, 53 likes
- open-r1/Mixture-of-Thoughts: 699k rows, 4.78k downloads, 292 likes
- open-r1/details-open-r1_OpenR1-Distill-7B: 859 rows, 53 downloads, 1 like
- open-r1/codeforces: 34.8k rows, 10.5k downloads, 83 likes
- open-r1/codeforces-submissions: 12.7M rows, 589 downloads, 8 likes
- open-r1/Big-Math-RL-Verified-Processed: 1M rows, 1.13k downloads, 25 likes
- open-r1/codeforces-cots: 254k rows, 3.83k downloads, 198 likes
- open-r1/verifiable-coding-problems-python_decontaminated-tested-shuffled: 15.1k rows, 323 downloads, 2 likes
- open-r1/verifiable-coding-problems-python_decontaminated-tested: 15.1k rows, 171 downloads
- open-r1/ioi-test-cases: 4.24k rows, 191 downloads, 3 likes