inferencerlabs committed on
Commit f2ee58a · verified · 1 Parent(s): cd17387

Upload complete model

Files changed (1):
  1. README.md +26 -0
README.md ADDED
@@ -0,0 +1,26 @@
+ ---
+ license: mit
+ library_name: mlx
+ base_model: deepseek-ai/DeepSeek-V3.1
+ tags:
+ - mlx
+ pipeline_tag: text-generation
+ ---
+ **See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtube.com/xcreate)**
+
+ *q5.5bit quant typically achieves 1.141 perplexity in our testing.*
+
+ | Quantization | Perplexity |
+ |:------------:|:----------:|
+ | **q2.5** | 41.293 |
+ | **q3.5** | 1.900 |
+ | **q4.5** | 1.168 |
+ | **q5.5** | 1.141 |
+ | **q6.5** | 1.128 |
+ | **q8.5** | 1.128 |
+
+ ## Usage Notes
+
+ * Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
+ * Memory usage: ~480 GB
+ * Expect ~15 tokens/s
+ * For more details, see the [demonstration video](https://youtube.com/xcreate) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1).
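MLX quants on Hugging Face are commonly loaded through the `mlx-lm` package rather than raw MLX. A minimal sketch of that pattern is below; the card does not state this model's repository id, so the path used here is a placeholder, and running it requires Apple silicon with roughly the ~480 GB of unified memory noted above.

```python
# Hypothetical usage sketch with mlx-lm (pip install mlx-lm).
# The repo id below is a placeholder -- substitute this model's
# actual Hugging Face path before running.
from mlx_lm import load, generate

# Downloads (if needed) and loads the quantized weights plus tokenizer.
model, tokenizer = load("inferencerlabs/DeepSeek-V3.1-MLX-5.5bit")

prompt = "Explain perplexity in one sentence."
text = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(text)
```

At ~15 tokens/s, a 128-token completion like this one should take on the order of ten seconds once the model is resident in memory.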