Update README.md
README.md (CHANGED)
```diff
@@ -3,7 +3,7 @@ base_model:
 - deepseek-ai/DeepSeek-V3
 ---
 This is the first 4 layers of DeepSeek-V3 with GPTQ-style quantization.
-- Layer 4's routed experts are quantized to 2-bit
+- All of Layer 4's routed experts (256 experts) are quantized to 2-bit
 - All other Linear layers are quantized to 4-bit (including MLA, the dense-layer FFNs, and the shared expert)
 
 To load and run this model:
```
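The updated bullet list describes a mixed-precision scheme: 2-bit for the routed experts of the one MoE layer, 4-bit everywhere else. As a hypothetical sketch (not code from this repo), the rule maps onto module names roughly as follows, assuming DeepSeek-V3's usual naming where routed experts sit under `mlp.experts` and "Layer 4" is index 3 under 0-based indexing:

```python
# Hypothetical sketch of the bit-width rule described above; not from the repo.
# Assumes DeepSeek-V3-style module paths such as
# "model.layers.3.mlp.experts.17.gate_proj" ("Layer 4" == index 3, 0-based).
def gptq_bits_for(module_name: str) -> int:
    """Return the GPTQ bit-width applied to a given Linear module."""
    if ".layers.3.mlp.experts." in module_name:
        return 2  # the MoE layer's 256 routed experts -> 2-bit
    return 4      # MLA attention, dense-layer FFNs, shared expert -> 4-bit
```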
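The hunk ends before the README's actual loading snippet, so it is not shown here. Purely as a hedged sketch of what loading a GPTQ checkpoint like this through transformers typically looks like (the repo id below is a placeholder, and the README's real instructions may differ):

```python
# A minimal sketch, not the README's own snippet (the diff cuts off before it).
# Placeholder repo id; GPTQ weights load through transformers when the
# checkpoint ships a quantization_config.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "user/DeepSeek-V3-first4layers-gptq"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # non-quantized tensors stay in bf16
    device_map="auto",           # place modules across available devices
    trust_remote_code=True,      # DeepSeek-V3 ships custom modeling code
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

A 4-layer slice like this is mainly useful for exercising the quantized kernels, so output quality from `generate` is beside the point.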