See DeepSeek-V3.1 5.5bit MLX in action - demonstration video

The 5.5-bit quant (q5.5) typically achieves a perplexity of 1.141 in our testing.

| Quantization | Perplexity |
|--------------|------------|
| q2.5         | 41.293     |
| q3.5         | 1.900      |
| q4.5         | 1.168      |
| q5.5         | 1.141      |
| q6.5         | 1.128      |
| q8.5         | 1.128      |
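
For readers comparing the rows, here is a minimal sketch of how a perplexity figure is conventionally derived and how the reported values relate to one another. It assumes the standard definition (the exponential of the mean negative log-likelihood per token). The card does not state the evaluation dataset or the exact procedure, so the helper names `perplexity` and `relative_increase` are hypothetical and not the authors' tooling.

```python
import math

def perplexity(token_logprobs):
    """Standard definition (assumed, not confirmed by the card):
    exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def relative_increase(ppl, baseline):
    """Fractional perplexity increase of one quant over a baseline quant."""
    return (ppl - baseline) / baseline

# Perplexities copied from the table above.
reported = {
    "q2.5": 41.293,
    "q3.5": 1.900,
    "q4.5": 1.168,
    "q5.5": 1.141,
    "q6.5": 1.128,
    "q8.5": 1.128,
}

# Use the highest-precision quant listed as the reference point.
baseline = reported["q8.5"]
for name, ppl in reported.items():
    change = relative_increase(ppl, baseline) * 100
    print(f"{name}: perplexity {ppl:.3f} ({change:+.1f}% vs q8.5)")
```

Using q8.5 as the baseline is a choice made for this illustration only. The card does not report a perplexity for the unquantized model.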

Usage Notes

Safetensors
Model size: 671B params
Tensor type: BF16 · U32 · F32