Commit bd25c8d by Suu · verified · 1 Parent(s): a742022

Update README.md

Files changed (1)
  1. README.md +9 -1
README.md CHANGED
@@ -117,11 +117,19 @@ YOUR_TEST_FILE="<test_data_path>"
  ```
 
  ### Evaluation
- When we expand the inference budget to 64K and adopt **the YaRN method with a scaling factor of 2.5**. **Evaluation is coming soon, stay tuned.**
+ When we expand the inference budget to 64K and adopt **the YaRN method with a scaling factor of 2.5**.
 
  The evaluation data for AIME24, AIME25, and HMMT2025 are available in our GitHub repository under the **benchmarks directory**.
  For LiveCodeBench, please download the data from the official website.
 
+ You can run the following commands to perform inference and evaluation:
+ ```bash
+ git clone https://github.com/suu990901/KlearReasoner
+ cd KlearReasoner/benchmarks
+ python inference.py --model <KlearReasoner-8B_path> --n 64 --dataset_path ./benchmarks/aime24.qs.jsonl
+ python judge_math.py <path_to_inference_results>
+ ```
+
  ## 🤝 Citation
  If you find this work helpful, please cite our paper:
  ```bibtex