nm-research committed on
Commit 2178b27 · verified · 1 Parent(s): 2c038bb

Update README.md

Files changed (1)
  1. README.md +11 -9
README.md CHANGED
@@ -189,16 +189,18 @@ evalplus.evaluate \
 
 | Metric | ibm-granite/granite-3.1-8b-instruct | neuralmagic-ent/granite-3.1-8b-instruct-quantized.w4a16 |
 |-----------------------------------------|:---------------------------------:|:-------------------------------------------:|
-| ARC-Challenge (Acc-Norm, 25-shot) | | |
-| GSM8K (Strict-Match, 5-shot) | | |
-| HellaSwag (Acc-Norm, 10-shot) | | |
-| MMLU (Acc, 5-shot) | | |
-| TruthfulQA (MC2, 0-shot) | | |
-| Winogrande (Acc, 5-shot) | | |
-| **Average Score** | **** | **** |
-| **Recovery** | **100.00** | **** |
+| ARC-Challenge (Acc-Norm, 25-shot) | 66.81 | 66.98 |
+| GSM8K (Strict-Match, 5-shot) | 64.52 | 68.08 |
+| HellaSwag (Acc-Norm, 10-shot) | 84.18 | 83.30 |
+| MMLU (Acc, 5-shot) | 65.52 | 63.96 |
+| TruthfulQA (MC2, 0-shot) | 60.57 | 60.62 |
+| Winogrande (Acc, 5-shot) | 80.19 | 78.61 |
+| **Average Score** | **70.30** | **70.26** |
+| **Recovery** | **100.00** | **99.94** |
 
 #### HumanEval pass@1 scores
-
+| Metric | ibm-granite/granite-3.1-8b-instruct | neuralmagic-ent/granite-3.1-8b-instruct-quantized.w4a16 |
+|-----------------------------------------|:---------------------------------:|:-------------------------------------------:|
+| HumanEval Pass@1 | 71.00 | 70.90 |
 
 
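
The Average Score and Recovery rows added in this commit can be sanity-checked with a few lines of Python. This is only a sketch: the README does not define Recovery in this diff, so the formula below (quantized average divided by baseline average, times 100) is an assumption that happens to reproduce the reported figures. The dictionary and function names are illustrative and are not part of the repository.

```python
# Scores copied from the added table rows: (baseline, quantized w4a16).
scores = {
    "ARC-Challenge": (66.81, 66.98),
    "GSM8K": (64.52, 68.08),
    "HellaSwag": (84.18, 83.30),
    "MMLU": (65.52, 63.96),
    "TruthfulQA": (60.57, 60.62),
    "Winogrande": (80.19, 78.61),
}


def average(values):
    """Unweighted arithmetic mean of a list of scores."""
    return sum(values) / len(values)


baseline_avg = average([base for base, _ in scores.values()])
quantized_avg = average([quant for _, quant in scores.values()])

# Assumed definition: recovery is the quantized average expressed
# as a percentage of the baseline average.
recovery = 100.0 * quantized_avg / baseline_avg

print(f"Average Score: {baseline_avg:.2f} vs {quantized_avg:.2f}")
print(f"Recovery: {recovery:.2f}")
```

Under that assumption the script prints 70.30 and 70.26 for the averages and 99.94 for recovery, matching the values in the table.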