Add metadata, link to code
Browse filesThis PR adds relevant metadata to improve model discoverability. It also includes a concise model description and links to the paper and code.
README.md
CHANGED
|
@@ -1,3 +1,11 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
pipeline_tag: question-answering
|
| 4 |
+
library_name: transformers
|
| 5 |
+
---
|
| 6 |
+
|
| 7 |
+
This repository contains the model described in the paper [SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild](https://arxiv.org/abs/2503.18892).
|
| 8 |
+
|
| 9 |
+
This model has been trained with a simple reinforcement learning (RL) recipe to improve reasoning abilities. Training starts from base models and uses rule-based rewards and the GSM8K/Math datasets. This approach has been successfully applied to diverse base models with limited data (8K examples), achieving significant accuracy gains ranging from 10 to more than 20 absolute points.
|
| 10 |
+
|
| 11 |
+
Code: https://github.com/hkust-nlp/simpleRL-reason/tree/v1
|