---
license: mit
library_name: mlx
base_model: deepseek-ai/DeepSeek-V3.1
tags:
- mlx
pipeline_tag: text-generation
---
					
						
## ----CURRENTLY UPLOADING FILES-----
This notice will be removed once all files have been uploaded.
					
						
## Notes
**See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtu.be/ufXZI6aqOU8)**

*The q5.5 quant typically achieves a perplexity of 1.141 in our testing.*
					
						
| Quantization | Perplexity |
|:------------:|:----------:|
| **q2.5**     | 41.293     |
| **q3.5**     | 1.900      |
| **q4.5**     | 1.168      |
| **q5.5**     | 1.141      |
| **q6.5**     | 1.128      |
| **q8.5**     | 1.128      |
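
For readers unfamiliar with the metric, perplexity is conventionally the exponential of the mean negative log-likelihood per token, so a value close to 1 means the model assigns high probability to the reference tokens. The sketch below shows only that standard definition. This card does not say which dataset or evaluation script produced the numbers above, and the log-probabilities used here are made-up values for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-(1/N) * sum(log p_i)) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities, not taken from any real evaluation.
example_log_probs = [math.log(0.9), math.log(0.8), math.log(0.95)]
print(f"perplexity = {perplexity(example_log_probs):.3f}")
```

Under this definition, a model that always assigns probability 0.5 to the correct token has a perplexity of exactly 2, and a model that is always certain and correct has a perplexity of 1.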
					
						
## Usage Notes

* Runs on a single M3 Ultra with 512 GB of RAM
* Memory usage: ~480 GB
* Expect ~13-19 tokens/s
* Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
* For more details, see the [demonstration video](https://youtu.be/ufXZI6aqOU8) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1).
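
As a rough sanity check on the memory figure above, the weight footprint of a quantized model can be approximated as parameters times bits per weight, divided by 8. The sketch below rests on two assumptions that this card does not state: that DeepSeek-V3.1 has about 671 billion total parameters, and that the "q5.5" label corresponds to an average of 5.5 bits per weight. The function name is hypothetical.

```python
def estimated_weight_gb(num_params, bits_per_weight):
    """Approximate size of the quantized weights in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: ~671e9 total parameters (not stated in this card).
NUM_PARAMS = 671e9

# Quantization levels listed in the perplexity table above.
for bits in (2.5, 3.5, 4.5, 5.5, 6.5, 8.5):
    print(f"q{bits}: ~{estimated_weight_gb(NUM_PARAMS, bits):.0f} GB")
```

Under these assumptions the q5.5 weights alone come to about 461 GB, which sits below the ~480 GB reported above. The difference would plausibly be the KV cache, activations, and runtime overhead, none of which this estimate covers.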