Intel/DeepSeek-V3.1-Base-int4-mixed-AutoRound

Text Generation

4-bit precision

wenhuach committed 5 days ago

Commit 62d5300 (verified) · 1 Parent(s): 555c52f

Update README.md

Files changed (1)

README.md +5 -3

@@ -6,7 +6,7 @@ pipeline_tag: text-generation
 ## Model Details
-This model is an mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) via RTN(no algorithm tuning).
+This model is an mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) **via RTN(no algorithm tuning)**.
 Non expert layers are fallback to 8 bits. Please refer to Section Generate the model for more details.
 Please follow the license of the original model.
@@ -71,7 +71,9 @@ Some of the most popular adventures include:
 ```
 ### Generate the model
-this pr is required if the model is fp8 and the device supports fp8  https://github.com/intel/auto-round/pull/750
+This pr is required if the model is fp8 and the device supports fp8  https://github.com/intel/auto-round/pull/750
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -90,7 +92,7 @@ for n, m in model.named_modules():
             layer_config[n] = {"bits": 8}
             print(n, 8)
-autoround = AutoRound(model_name,, iters=0, layer_config=layer_config)
+autoround = AutoRound(model_name, iters=0, layer_config=layer_config)
 autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")
 ```
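
Note on the terms in the changed Model Details line: "RTN" (round-to-nearest) with `group_size 128` and symmetric quantization means each group of 128 weights shares one scale and each weight is rounded to the nearest int4 level with no tuning pass, which is what `iters=0` requests. The sketch below illustrates that idea in plain Python. It is not auto-round's implementation: the function names are hypothetical, and the scale convention (`max_abs / 7`) is one common choice that may differ from the library's.

```python
def rtn_quantize_group(weights, bits=4):
    """Symmetric round-to-nearest quantization of one group of weights.

    Hypothetical helper for illustration; not part of auto-round.
    """
    qmax = 2 ** (bits - 1) - 1       # 7 for int4
    qmin = -(2 ** (bits - 1))        # -8 for int4
    max_abs = max(abs(w) for w in weights)
    # Assumed convention: map the largest magnitude in the group to qmax.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return q, scale


def rtn_quantize(weights, group_size=128, bits=4):
    """Quantize a flat list of weights group by group.

    Returns the integer codes and one scale per group.
    """
    q_all, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        q, scale = rtn_quantize_group(group, bits)
        q_all.extend(q)
        scales.append(scale)
    return q_all, scales


def dequantize(q_all, scales, group_size=128):
    """Reconstruct approximate weights from integer codes and group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(q_all)]


if __name__ == "__main__":
    # Two groups of 128 synthetic weights.
    weights = [i / 100 - 1.28 for i in range(256)]
    q, scales = rtn_quantize(weights)
    restored = dequantize(q, scales)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("groups:", len(scales), "max abs error:", worst)
```

Under this convention the reconstruction error of each weight is at most half of its group's scale, which is why a smaller group size (more scales) generally costs storage but reduces error.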