inferencerlabs/DeepSeek-V3.1-MLX-5.5bit

Text Generation


inferencerlabs committed on Aug 21

Commit 6c5d703 · verified · 1 Parent(s): b376fe6

Upload complete model

Files changed (1)

README.md +6 -2

README.md CHANGED

@@ -6,9 +6,12 @@ tags:
 - mlx
 pipeline_tag: text-generation
 ---
 **See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtube.com/xcreate)**
-*q5.5bit quant typically achieves 1.141 perplexity in our testing
 | Quantization | Perplexity |
 |:------------:|:----------:|
 | **q2.5**     | 41.293     |
@@ -20,7 +23,8 @@ pipeline_tag: text-generation
 ## Usage Notes
-* Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
 * Memory usage: ~480 GB
 * Expect ~15 tokens/s
 * For more details see [demonstration video](https://youtube.com/xcreate) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1).

 - mlx
 pipeline_tag: text-generation
 ---
+## CURRENTLY UPLOADING FILES
+This notice will be removed once all files are uploaded.
 **See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtube.com/xcreate)**
+*q5.5bit quant typically achieves 1.141 perplexity in our testing*
 | Quantization | Perplexity |
 |:------------:|:----------:|
 | **q2.5**     | 41.293     |
 ## Usage Notes
+* Runs on a single M3 Ultra 512GB RAM
 * Memory usage: ~480 GB
 * Expect ~15 tokens/s
+* Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
 * For more details see [demonstration video](https://youtube.com/xcreate) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1).
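
The memory and throughput figures in the usage notes can be sanity-checked with a rough calculation. The sketch below is not from the model card: it assumes a total parameter count of about 671B (the figure commonly cited for the DeepSeek-V3 family) and treats 5.5 bits as the average cost per weight, ignoring KV cache and runtime overhead. The helper names `weight_bytes` and `generation_seconds` are hypothetical.

```python
# Back-of-envelope sizing for a 5.5-bit quant.
# ASSUMPTION: PARAMS is not stated in the model card; ~671B is the
# commonly cited total parameter count for the DeepSeek-V3 family.
PARAMS = 671e9           # assumed total parameter count
BITS_PER_WEIGHT = 5.5    # taken from the quant name (q5.5)
TOKENS_PER_SECOND = 15   # taken from the usage notes


def weight_bytes(params: float, bits: float) -> float:
    """Bytes needed to store `params` weights at `bits` bits each."""
    return params * bits / 8


def generation_seconds(n_tokens: int, tps: float) -> float:
    """Time to generate `n_tokens` at a steady `tps` tokens per second."""
    return n_tokens / tps


weights_gb = weight_bytes(PARAMS, BITS_PER_WEIGHT) / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")   # ~461 GB
print(f"1000 tokens:   ~{generation_seconds(1000, TOKENS_PER_SECOND):.0f} s")  # ~67 s
```

Under these assumptions the weights alone come to roughly 461 GB, which is in line with the stated ~480 GB once cache and runtime overhead are added, and fits within 512 GB of unified memory.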