Depth-upscaled SOLAR-10.7B delivers remarkable performance, outperforming models with up to 30B parameters and even surpassing the recent Mixtral 8x7B model. For details, please refer to the experimental table ([link to be updated soon]).
SOLAR-10.7B is an ideal choice for fine-tuning, offering robustness and adaptability. Even simple instruction fine-tuning of the SOLAR-10.7B pre-trained model yields significant performance improvements. [link to be updated soon]
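The README does not spell out the fine-tuning recipe, so the sketch below shows one common, inexpensive approach: attaching LoRA adapters with the `peft` library. The model id `upstage/SOLAR-10.7B-v1.0` and the chosen LoRA hyperparameters are illustrative assumptions, not the recipe used by the authors.

```python
# Illustrative only: LoRA adapter fine-tuning via `peft`; the model id and
# hyperparameters below are assumptions, not the authors' exact recipe.

MODEL_ID = "upstage/SOLAR-10.7B-v1.0"  # assumed Hugging Face checkpoint id


def build_lora_model():
    """Wrap the pre-trained model with low-rank adapters so that only a
    small fraction of the parameters needs to be trained."""
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    config = LoraConfig(
        r=16,                                 # adapter rank
        lora_alpha=32,                        # adapter scaling factor
        target_modules=["q_proj", "v_proj"],  # attention projections to adapt
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # only the adapter weights are trainable
    return model
```

The adapted model can then be trained with any standard causal-LM training loop (e.g. `transformers.Trainer`) on an instruction dataset.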
# **Evaluation Results**

H6 is the average score across the six Hugging Face Open LLM Leaderboard benchmarks (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K).

| Model                                | H6        | Model Size |
|--------------------------------------|-----------|------------|
| **SOLAR-10.7B-Instruct-v1.0**        | **74.20** | **~ 11B**  |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 72.62     | ~ 46.7B    |
| 01-ai/Yi-34B-200K                    | 70.81     | ~ 34B      |
| 01-ai/Yi-34B                         | 69.42     | ~ 34B      |
| mistralai/Mixtral-8x7B-v0.1          | 68.42     | ~ 46.7B    |
| meta-llama/Llama-2-70b-hf            | 67.87     | ~ 70B      |
| tiiuae/falcon-180B                   | 67.85     | ~ 180B     |
| **SOLAR-10.7B-v1.0**                 | **66.04** | **~ 11B**  |
| Qwen/Qwen-14B                        | 65.86     | ~ 14B      |
| mistralai/Mistral-7B-Instruct-v0.2   | 65.71     | ~ 7B       |
| 01-ai/Yi-34B-Chat                    | 65.32     | ~ 34B      |
| meta-llama/Llama-2-70b-chat-hf       | 62.40     | ~ 70B      |
| mistralai/Mistral-7B-v0.1            | 60.97     | ~ 7B       |
| mistralai/Mistral-7B-Instruct-v0.1   | 54.96     | ~ 7B       |
# **Usage Instructions**
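A minimal inference sketch with Hugging Face `transformers`, assuming the model is hosted on the Hub under the id `upstage/SOLAR-10.7B-v1.0` (substitute the official checkpoint name). Loading in half precision keeps the ~11B parameters at roughly 21 GB of weights instead of ~43 GB in fp32.

```python
# Minimal usage sketch; the model id below is an assumption -- replace it
# with the official Hugging Face checkpoint name.

MODEL_ID = "upstage/SOLAR-10.7B-v1.0"  # assumed Hugging Face checkpoint id


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load SOLAR-10.7B in half precision and generate a completion."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision to fit on a single large GPU
        device_map="auto",          # place layers on the available GPU(s)
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Explain depth up-scaling in one sentence."))
```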