Open R1


https://github.com/huggingface/open-r1




Welcome to Open-R1 🐳🤗

Open-R1 is an open initiative to replicate and extend the techniques behind DeepSeek-R1, a state-of-the-art reasoning model, in a fully transparent and collaborative way: https://github.com/huggingface/open-r1

This organization is dedicated to:

- Sharing datasets and models built on the path to replicating DeepSeek-R1.
- Fostering meaningful discussions and collaboration in the Community tab.

By working together, we aim to create a robust foundation for reasoning models that the entire research and industry community can leverage.

Plan of attack

We are using the DeepSeek-R1 tech report as a guide to recreate DeepSeek's pipeline. The work can be broken down into three main steps:

1. Replicate R1-Distill: Distill a high-quality reasoning corpus from DeepSeek-R1 to create the R1-Distill models.
2. Recreate the pure RL pipeline: Reproduce the reinforcement learning process that DeepSeek used to train R1-Zero. This will likely require curating new, large-scale datasets for math, reasoning, and code.
3. Demonstrate end-to-end training: Show that we can go from a base model to RL-tuned reasoning capabilities through a multi-stage training approach, combining supervised fine-tuning (SFT) and reinforcement learning (RL).
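
To make the pure RL step more concrete: the DeepSeek-R1 tech report describes training R1-Zero with rule-based rewards, one for answer accuracy and one for output format. The snippet below is a minimal sketch of that idea, not the Open-R1 implementation. The function names, the <think>/<answer> tag layout, and the exact-string answer comparison are all assumptions made for illustration; real math or code tasks would need a proper answer verifier.

```python
import re

# Assumed output template: reasoning inside <think>, final answer inside <answer>.
# The tag names are illustrative and may differ from any real training setup.
COMPLETION_PATTERN = re.compile(
    r"<think>.*?</think>\s*<answer>(.*?)</answer>", re.DOTALL
)


def format_reward(completion: str) -> float:
    """Return 1.0 if the whole completion follows the assumed template, else 0.0."""
    return 1.0 if COMPLETION_PATTERN.fullmatch(completion.strip()) else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the extracted answer equals the reference string, else 0.0.

    Exact string matching is a simplification chosen for this sketch.
    """
    match = COMPLETION_PATTERN.fullmatch(completion.strip())
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str) -> float:
    """Sum both rule-based signals; the equal weighting is a hypothetical choice."""
    return format_reward(completion) + accuracy_reward(completion, reference)


if __name__ == "__main__":
    sample = "<think>2 + 2 = 4</think>\n<answer>4</answer>"
    print(total_reward(sample, "4"))
```

Because both signals are computed by simple rules rather than a learned reward model, they are cheap to evaluate over many sampled completions, which is one reason curated datasets with checkable answers matter for this step.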

How to contribute

This project thrives on community participation! Here are some ways you can contribute:

- Join the discussion: Share ideas, ask questions, and collaborate with others in the Community tab.
- Contribute code or datasets: Submit pull requests with datasets, models, or improvements to the pipeline.
- Experiment and share results: Try out different approaches and share your findings with the community.

Let’s build something impactful together. 🚀