inferencerlabs/DeepSeek-V3.1-MLX-5.5bit

Text Generation

inferencerlabs committed on Aug 22

Commit 7fa1142 (verified) · 1 Parent(s): 837971a

Upload complete model

Files changed (1)

README.md +5 -5

@@ -6,11 +6,11 @@ tags:
 - mlx
 pipeline_tag: text-generation
 ---
-## CURRENTLY UPLOADING FILES
 This notice will be removed once all files have been uploaded.
-...
-**See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtube.com/xcreate)**
 *q5.5bit quant typically achieves 1.141 perplexity in our testing*
 | Quantization | Perplexity |
@@ -26,6 +26,6 @@ This notice will be removed once all files have been uploaded.
 * Runs on a single M3 Ultra 512GB RAM
 * Memory usage: ~480 GB
-* Expect ~15 tokens/s
 * Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
-* For more details see [demonstration video](https://youtube.com/xcreate) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1).

 - mlx
 pipeline_tag: text-generation
 ---
+## ----CURRENTLY UPLOADING FILES-----
 This notice will be removed once all files have been uploaded.
+## Notes
+**See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtu.be/ufXZI6aqOU8)**
 *q5.5bit quant typically achieves 1.141 perplexity in our testing*
 | Quantization | Perplexity |
 * Runs on a single M3 Ultra 512GB RAM
 * Memory usage: ~480 GB
+* Expect ~13-19 tokens/s
 * Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
+* For more details see [demonstration video](https://youtu.be/ufXZI6aqOU8) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1).
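
The memory and throughput figures in the README can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the parameter count (`ASSUMED_PARAMS`) is an assumption that does not appear in this README, the 5.5-bit figure is treated as a flat average per weight, and the helper names are hypothetical. Quantization scales, the KV cache, and runtime buffers are not modeled, so the result covers only the quantized weights and should land somewhat below the ~480 GB quoted above.

```python
# Rough sanity check for the README figures. All names here are hypothetical
# helpers, and ASSUMED_PARAMS is an assumption not stated in the README.

ASSUMED_PARAMS = 671e9   # assumed total parameter count for DeepSeek-V3.1
BITS_PER_WEIGHT = 5.5    # nominal average bits per weight for this quant


def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    size = weight_footprint_gb(ASSUMED_PARAMS, BITS_PER_WEIGHT)
    print(f"weights alone: ~{size:.0f} GB")
    # The README quotes roughly 13-19 tokens/s; show both ends of the range.
    for rate in (13, 19):
        secs = generation_seconds(1000, rate)
        print(f"1000 tokens at {rate} tok/s: ~{secs:.0f} s")
```

Under these assumptions the weights account for most of the quoted memory usage, which is consistent with the model needing the 512GB M3 Ultra configuration.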