Update README.md
README.md CHANGED

@@ -23,7 +23,7 @@ should probably proofread and complete it, then remove this comment. -->
 NuminaMath is a series of language models that are trained with two stages of supervised fine-tuning to solve math problems using chain of thought (CoT) and tool-integrated reasoning (TIR):
 
 * **Stage 1:** fine-tune the base model on a large, diverse dataset of natural language math problems and solutions, where each solution is templated with Chain of Thought (CoT) to facilitate reasoning.
-* **Stage 2:** fine-tune the model from Stage 1 on a synthetic dataset of tool-integrated reasoning, where each math problem is decomposed into a sequence of rationales, Python programs, and their outputs.
+* **Stage 2:** fine-tune the model from Stage 1 on a synthetic dataset of tool-integrated reasoning, where each math problem is decomposed into a sequence of rationales, Python programs, and their outputs.
 
 NuminaMath 7B CoT is the model from Stage 1 and was fine-tuned on [AI-MO/NuminaMath-CoT](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT), a large-scale dataset of 860k+ math competition problem-solution pairs.
 
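The hunk above identifies the Stage 1 (CoT) model and its training data. For readers landing on this commit, a minimal inference sketch follows; the repo id comes from the citation URL later in this diff, while the chat-template call and generation settings are assumptions rather than the card's official snippet.

```python
# Minimal sketch: prompting NuminaMath 7B CoT for a chain-of-thought solution.
# The chat-template call and generation settings are assumptions; check the
# model card's own usage section before relying on them.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AI-MO/NuminaMath-7B-CoT"  # repo id taken from the citation URL in this diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "What is the remainder when 2^100 is divided by 7?"}
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

# Greedy decoding keeps the chain of thought reproducible; sampling also works.
output = model.generate(input_ids, max_new_tokens=1024, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The Stage 2 (TIR) variant would wrap a generate call like this in a loop that extracts each emitted Python block, executes it, and appends the program output back into the context before generating again.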
@@ -39,7 +39,6 @@ NuminaMath 7B CoT is the model from Stage 1 and was fine-tuned on [AI-MO/NuminaM
 <!-- Provide the basic links for the model. -->
 
 - **Repository:** https://github.com/project-numina/aimo-progress-prize
-- **Demo:** https://huggingface.co/spaces/AI-MO/math-olympiad-solver
 
 ## Intended uses & limitations
 
@@ -108,30 +107,14 @@ If you find NuminaMath 7B TIR is useful in your work, please cite it with:
 ```
 @misc{numina_math_7b,
   author = {Edward Beeching and Shengyi Costa Huang and Albert Jiang and Jia Li and Benjamin Lipkin and Zihan Qin and Kashif Rasul and Ziju Shen and Roman Soletskyi and Lewis Tunstall},
-  title = {NuminaMath 7B TIR},
+  title = {NuminaMath 7B CoT},
   year = {2024},
   publisher = {Numina & Hugging Face},
   journal = {Hugging Face repository},
-  howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-7B-TIR}}
+  howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-7B-CoT}}
 }
 ```
 
-This model is a fine-tuned version of [deepseek-ai/deepseek-math-7b-base](https://huggingface.co/deepseek-ai/deepseek-math-7b-base) on the AI-MO/numina-dataset-v1.0-release-candidate-1-preproc dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.4859
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
 ## Training procedure
 
 ### Training hyperparameters
@@ -150,14 +133,6 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 3.0
 
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:-----:|:---------------:|
-| 0.4814 | 1.0 | 6920 | 0.4942 |
-| 0.4188 | 2.0 | 13840 | 0.4728 |
-| 0.3496 | 3.0 | 20760 | 0.4859 |
-
 
 ### Framework versions
 
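Only two hyperparameters are visible in this hunk; both map onto standard `transformers.TrainingArguments` fields. A minimal sketch of that mapping is below, where every value other than the two shown above is an assumption:

```python
# Minimal sketch: the two hyperparameters visible in the hunk above, expressed
# as transformers.TrainingArguments. All other values are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="numinamath-7b-cot-sft",  # hypothetical output path
    warmup_ratio=0.1,                    # lr_scheduler_warmup_ratio: 0.1
    num_train_epochs=3.0,                # num_epochs: 3.0
    bf16=True,                           # assumption, typical for a 7B SFT run
)
```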