Upload complete model
README.md
CHANGED
@@ -6,10 +6,6 @@ tags:
 - mlx
 pipeline_tag: text-generation
 ---
-## ----CURRENTLY UPLOADING FILES-----
-This notice will be removed once all files have been uploaded.
-
-## Notes
 **See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtu.be/ufXZI6aqOU8)**
 
 *q5.5bit quant typically achieves 1.141 perplexity in our testing*
@@ -24,8 +20,8 @@ This notice will be removed once all files have been uploaded.
 
 ## Usage Notes
 
-* Runs on a single M3 Ultra 512GB RAM
+* Runs on a single M3 Ultra 512GB RAM using [Inferencer app](https://inferencer.com)
 * Memory usage: ~480 GB
 * Expect ~13-19 tokens/s
-*
+* Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
 * For more details see [demonstration video](https://youtu.be/ufXZI6aqOU8) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1).
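
The ~480 GB memory figure in the usage notes is close to what a back-of-envelope estimate suggests. The following is a minimal sketch, not the author's method: it assumes DeepSeek-V3.1 has roughly 671 billion parameters (a figure from the upstream model card, not stated in this commit) and that every weight is stored at an average of 5.5 bits. The function and constant names are hypothetical.

```python
def quantized_weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 1e9


# Assumption: ~671e9 parameters, taken from the upstream DeepSeek model card.
ASSUMED_PARAMS = 671e9
# From the README: the quant is described as 5.5 bits per weight.
BITS_PER_WEIGHT = 5.5

weights_gb = quantized_weight_gb(ASSUMED_PARAMS, BITS_PER_WEIGHT)
print(f"weights alone: ~{weights_gb:.0f} GB")
```

Under these assumptions the weights alone come to about 461 GB. The remaining gap up to the reported ~480 GB would plausibly be KV cache, activations, and quantization metadata, though the commit does not break this down.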