Update README.md
README.md
CHANGED
@@ -197,6 +197,7 @@ Evaluation results of non reasoning models and reasoning models in no thinking m
 | Alignment | MixEval Hard | 26.9 | <u>27.6</u> | 24.9 | 24.3 | **31.6** |
 | Tool Calling | BFCL| <u>92.3</u> | - | <u>92.3</u> * | 89.5 | **95.0** |
 | Multilingual Q&A | Global MMLU | <u>53.5</u> | 50.54 | 46.8 | 49.5 | **65.1** |
+
 (*): this is a tool calling finetune
 
 ### Extended Thinking