Update README.md
README.md
CHANGED
@@ -143,12 +143,23 @@ print(make_table(results))
 |----------------------------------|-------------|-------------------|
 | | Phi-4 mini-Ins | phi4-mini-8dq4w |
 | **Popular aggregated benchmark** | | |
-| mmlu | 66.73
-| mmlu_pro | 44.71
+| mmlu | 66.73 | 63.11 |
+| mmlu_pro | 44.71 | 35.31 |
 | **Reasoning** | | |
-
+| arc_challenge | TODO | TODO |
+| gpqa | TODO | TODO |
+| hellaswag | 54.57 | 53.24 |
+| openbookqa | TODO | TODO |
+| piqa | TODO | TODO |
+| siqa | TODO | TODO |
+| truthfulqa | TODO | TODO |
+| winogrande | TODO | TODO |
 | **Multilingual** | | |
+| Mgsm | TODO | TODO |
+| mgsm_cot_native | TODO | TODO |
 | **Math** | | |
+| gsm8k | TODO | TODO |
+| Mathqa | TODO | TODO |
 | **Overall** | **TODO** | **TODO** |
 
 