## Introduction

This evaluation script reproduces the math benchmark scores of the AceMath-1.5B/7B/72B-Instruct models from their saved outputs. The benchmark data can be downloaded from [Qwen2.5-Math](https://github.com/QwenLM/Qwen2.5-Math/tree/main/evaluation/data).

## Calculate Scores

```console
python calculate_scores.py
```
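
For readers who want a rough picture of what scoring from saved outputs can look like, here is a minimal sketch. It is not the logic of `calculate_scores.py`: the field names `prediction` and `answer`, the helper names, and the exact-match comparison are all assumptions made for illustration. Real math graders usually also check mathematical equivalence (for example `0.5` against `1/2`), which this sketch does not attempt.

```python
import json


def normalize(answer: str) -> str:
    """Trim whitespace, a trailing period, and surrounding '$', then lowercase."""
    text = answer.strip().rstrip(".").strip("$").strip()
    return text.lower()


def accuracy(records) -> float:
    """Fraction of records whose normalized prediction equals the reference.

    Each record is assumed (hypothetically) to be a dict with the keys
    'prediction' and 'answer'; the real output files may use other names.
    """
    if not records:
        return 0.0
    correct = sum(
        normalize(r["prediction"]) == normalize(r["answer"]) for r in records
    )
    return correct / len(records)


def load_jsonl(path: str):
    """Read one JSON object per non-empty line (assumed file layout)."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Tiny in-memory example instead of a real output file.
    demo = [
        {"prediction": "42", "answer": "42"},
        {"prediction": " $1/2$. ", "answer": "1/2"},
        {"prediction": "3", "answer": "4"},
    ]
    print(f"accuracy: {accuracy(demo):.2%}")
```

To try the sketch on a file, pass the result of `load_jsonl("path/to/outputs.jsonl")` to `accuracy`, adjusting the keys to match the actual output format.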