Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ pipeline_tag: text-generation
 
 ## Model Details
 
-This model is an mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) via RTN(no algorithm tuning)
+This model is an mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) **via RTN(no algorithm tuning)**.
 Non expert layers are fallback to 8 bits. Please refer to Section Generate the model for more details.
 Please follow the license of the original model.
 
@@ -71,7 +71,9 @@ Some of the most popular adventures include:
 ```
 
 ### Generate the model
-
+
+This pr is required if the model is fp8 and the device supports fp8 https://github.com/intel/auto-round/pull/750
+
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -90,7 +92,7 @@ for n, m in model.named_modules():
 layer_config[n] = {"bits": 8}
 print(n, 8)
 
-autoround = AutoRound(model_name
+autoround = AutoRound(model_name, iters=0, layer_config=layer_config)
 autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")
 
 ```
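
Note on the change: the updated README stresses that the weights are produced by plain RTN (round-to-nearest) with symmetric int4 quantization and group_size 128, and the new `iters=0` argument is what skips tuning. The sketch below illustrates what that arithmetic might look like on a flat list of weights. It is not auto-round's implementation: the helper names (`rtn_quantize_group`, `rtn_quantize`, `dequantize`) are hypothetical, and the scale rule `max_abs / 7` is an assumed, commonly used choice for symmetric int4.

```python
import random


def rtn_quantize_group(weights, bits=4):
    """Symmetric round-to-nearest quantization of one group sharing a single scale."""
    qmax = 2 ** (bits - 1) - 1      # 7 for int4, 127 for int8
    qmin = -(2 ** (bits - 1))       # -8 for int4, -128 for int8
    max_abs = max(abs(w) for w in weights)
    # Assumed scale rule: map the largest magnitude in the group onto qmax.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer level, then clamp to the valid range.
    q = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return q, scale


def rtn_quantize(weights, bits=4, group_size=128):
    """Quantize a flat list group by group; returns (integer codes, per-group scales)."""
    q_all, scales = [], []
    for start in range(0, len(weights), group_size):
        q, scale = rtn_quantize_group(weights[start:start + group_size], bits)
        q_all.extend(q)
        scales.append(scale)
    return q_all, scales


def dequantize(q_all, scales, group_size=128):
    """Reconstruct approximate weights from integer codes and per-group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(q_all)]


# Toy data standing in for one row of a linear layer (values are made up).
random.seed(0)
w = [random.gauss(0.0, 0.02) for _ in range(256)]

q, scales = rtn_quantize(w, bits=4, group_size=128)
w_hat = dequantize(q, scales, group_size=128)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(len(scales), max_err)
```

Under these assumptions each group of 128 weights stores one scale, and the reconstruction error per weight is bounded by half of that group's scale. Calling the same helpers with `bits=8` gives the finer grid that the README's 8-bit fallback for non-expert layers would correspond to.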