Update README.md
README.md CHANGED
@@ -57,7 +57,7 @@ print(outputs[0]["generated_text"][-1])
 
 ## Evaluation Results
 
-We evaluate
+We evaluate Nemotron-UltraLong-8B on a diverse set of benchmarks, including long-context tasks (e.g., RULER, LV-Eval, and InfiniteBench) and standard tasks (e.g., MMLU, MATH, GSM-8K, and HumanEval). UltraLong-8B achieves superior performance on ultra-long context tasks while maintaining competitive results on standard benchmarks.
 
 ### Needle in a Haystack
 