---
license: mit
library_name: mlx
base_model: deepseek-ai/DeepSeek-V3.1
tags:
- mlx
pipeline_tag: text-generation
---
Currently uploading files.
This notice will be removed once all files have been uploaded.
Notes
See DeepSeek-V3.1 5.5-bit MLX in action in the demonstration video.
The q5.5 quant typically achieves a perplexity of 1.141 in our testing.
| Quantization | Perplexity |
|---|---|
| q2.5 | 41.293 |
| q3.5 | 1.900 |
| q4.5 | 1.168 |
| q5.5 | 1.141 |
| q6.5 | 1.128 |
| q8.5 | 1.128 |
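To make the table easier to read, here is a minimal sketch of how perplexity is conventionally defined (the exponential of the mean negative log-likelihood per token) and how far each quant sits from q8.5. The function names `perplexity` and `relative_increase` are illustrative, and the evaluation text and exact procedure behind the numbers above are not specified on this card, so this is not the script that produced them.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood), natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Perplexities copied from the table above.
reported = {
    "q2.5": 41.293,
    "q3.5": 1.900,
    "q4.5": 1.168,
    "q5.5": 1.141,
    "q6.5": 1.128,
    "q8.5": 1.128,
}


def relative_increase(quant, baseline="q8.5"):
    """Fractional perplexity increase of `quant` over `baseline`."""
    return reported[quant] / reported[baseline] - 1.0


if __name__ == "__main__":
    for name in reported:
        print(f"{name}: {relative_increase(name):+.1%} vs q8.5")
```

By this measure q5.5 is roughly 1% above q8.5, while q2.5 is far off the scale of the other quants.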
Usage Notes
- Runs on a single M3 Ultra with 512 GB of RAM
- Memory usage: ~480 GB
- Expected throughput: ~13-19 tokens/s
- Built with a modified version of MLX 0.26
- For more details, see the demonstration video or visit DeepSeek-V3.1.
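As a rough sanity check on the figures above, the sketch below estimates the weight footprint and the wall-clock time for a reply. It rests on two assumptions that are not stated on this card: that DeepSeek-V3.1 has about 671 billion parameters in total, and that "5.5 bit" is the average number of bits stored per weight. The helper names are illustrative.

```python
# Assumption (not from this card): about 671 billion total parameters.
TOTAL_PARAMS = 671e9
# Assumption: "q5.5" means an average of 5.5 bits stored per weight.
BITS_PER_WEIGHT = 5.5


def weight_size_gb(n_params, bits_per_weight):
    """Approximate size of the quantized weights in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


def generation_time_s(n_tokens, tokens_per_s):
    """Seconds needed to generate n_tokens at a steady decoding rate."""
    return n_tokens / tokens_per_s


if __name__ == "__main__":
    size = weight_size_gb(TOTAL_PARAMS, BITS_PER_WEIGHT)
    print(f"estimated weights: {size:.0f} GB")

    # Throughput range quoted in the usage notes: ~13-19 tokens/s.
    slow = generation_time_s(1000, 13)
    fast = generation_time_s(1000, 19)
    print(f"1000 tokens: {fast:.0f}-{slow:.0f} s")
```

Under these assumptions the weights alone come to about 461 GB, which is in line with the ~480 GB memory usage listed above once the KV cache and runtime overhead are added, and it leaves little headroom on a 512 GB machine.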