Update README.md
README.md
CHANGED
@@ -29,6 +29,8 @@ Luth was trained using full fine-tuning on the Luth-SFT dataset with [Axolotl](h

We used LightEval for evaluation, with custom tasks for the French benchmarks. The models were evaluated with `temperature=0`.
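
As a rough illustration of that decoding setting (not the LightEval harness itself), the sketch below runs the model with plain greedy decoding via `transformers`; the repo id is a placeholder and the prompt is arbitrary.

```python
# Minimal sketch (not the LightEval harness): greedy decoding with the
# Hugging Face transformers API, matching the temperature=0 evaluation setting.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Luth-1.7B-Instruct"  # placeholder; replace with the actual Hub repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

messages = [{"role": "user", "content": "Explique le théorème de Pythagore en une phrase."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# do_sample=False gives deterministic greedy decoding, i.e. the temperature=0 setting.
output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```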
### Evaluation Visualizations

**French Evaluation:**



@@ -37,6 +39,28 @@



### French Benchmark Scores

| Benchmark       | Qwen3-1.7B   | SmolLM2-1.7B-Instruct | Qwen2.5-1.5B-Instruct | Luth-1.7B-Instruct |
|-----------------|--------------|-----------------------|-----------------------|--------------------|
| ifeval-fr       | 54.53        | 31.24                 | 32.90                 | <u>57.67</u>       |
| gpqa-diamond-fr | 26.90        | 21.83                 | 28.93                 | <u>38.58</u>       |
| mmlu-fr         | 28.46        | 33.73                 | 46.25                 | <u>49.66</u>       |
| math-500-fr     | 60.80        | 11.20                 | 32.20                 | <u>64.00</u>       |
| arc-chall-fr    | 33.28        | 28.57                 | 32.68                 | <u>35.16</u>       |
| hellaswag-fr    | 24.86        | <u>49.58</u>          | 34.34                 | 31.93              |

### English Benchmark Scores

| Benchmark       | Qwen3-1.7B   | SmolLM2-1.7B-Instruct | Qwen2.5-1.5B-Instruct | Luth-1.7B-Instruct |
|-----------------|--------------|-----------------------|-----------------------|--------------------|
| ifeval-en       | <u>68.39</u> | 48.24                 | 39.93                 | 65.80              |
| gpqa-diamond-en | <u>31.82</u> | 24.75                 | 30.30                 | 31.82              |
| mmlu-en         | 52.74        | 50.27                 | 59.81                 | <u>60.19</u>       |
| math-500-en     | 69.20        | 22.40                 | 56.00                 | <u>70.00</u>       |
| arc-chall-en    | 36.09        | 42.32                 | 41.04                 | <u>42.24</u>       |
| hellaswag-en    | 46.96        | <u>66.94</u>          | 64.48                 | 58.55              |

## Citation

```bibtex