Update README.md
README.md
CHANGED
@@ -27,6 +27,11 @@ We take ChatML as our chat template:
 
 As we merged the predictors for FFN neurons in models, you can finetune TurboSparse-Mixtral with any framework and algorithm.
 
+## Limitations
+* TurboSparse, having just undergone training with 150B tokens, may still exhibit performance gaps in certain tasks.
+* The TurboSparse model has only been trained on English-language datasets, hence its capabilities in other languages are still lacking.
+* The model may produce unexpected outputs due to its small size and probabilistic generation paradigm.
+
 ## License
 
 The model is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage.