Update README.md
README.md
CHANGED
@@ -105,7 +105,7 @@ Please use "Please reason step by step, and put your final answer within \boxed{
 #### System‑Level Safety:
 - The model is designed to be deployed as part of a broader system that implements safety measures (e.g., Prompt Guard, Code Shield) to ensure outputs remain safe even under adversarial conditions.
 
----
+---s
 
 ### Safety Fine‑Tuning & Data Strategy
 
@@ -181,6 +181,11 @@ hf (pretrained=EpistemeAI/ReasoningCore-3B-R01), gen_kwargs: (None), limit: None
 |gpqa_diamond_zeroshot| 1|none | 0|acc |↑ |0.3182|± |0.0332|
 | | |none | 0|acc_norm|↑ |0.3182|± |0.0332|
 
+hf (pretrained=EpistemeAI/ReasoningCore-3B-R01), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 8
+| Tasks |Version| Filter |n-shot| Metric | |Value | |Stderr|
+|------------------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
+|gsm8k_cot_zeroshot| 3|flexible-extract| 0|exact_match|↑ |0.3154|± |0.0128|
+| | |strict-match | 0|exact_match|↑ |0.2873|± |0.0125|
 
 # Uploaded model
 