Update README.md
README.md CHANGED
@@ -6,7 +6,8 @@ pipeline_tag: text-generation
## Model Details

-This model is an mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round)
+This model is a mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) via RTN (no algorithm tuning).
+Non-expert layers fall back to 8 bits. Please refer to the "Generate the model" section for more details.
Please follow the license of the original model.

## How To Use
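The "How To Use" body is elided by this diff; for orientation, here is a minimal inference sketch. It assumes auto-round is installed, uses the tmp_autoround output directory produced by the "Generate the model" snippet below (a hub repo id for the quantized model would work the same way), and the prompt is an arbitrary example, not taken from this commit:

```python
# Minimal sketch: load and run the auto_round-format checkpoint with plain
# Transformers. "tmp_autoround" is the output_dir used later in this README;
# substitute the hub repo id of the quantized model if it has been uploaded.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "tmp_autoround"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", torch_dtype="auto")

prompt = "There is a girl who likes adventure,"  # arbitrary example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```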
@@ -70,16 +71,15 @@ Some of the most popular adventures include:
```

### Generate the model
+This PR is required if the model is FP8 and the device supports FP8: https://github.com/intel/auto-round/pull/750
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import transformers
+from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-V3.1-Base"

-tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cpu", torch_dtype="auto")

layer_config = {}
for n, m in model.named_modules():
    if isinstance(m, torch.nn.Linear):
@@ -90,9 +90,7 @@ for n, m in model.named_modules():
            layer_config[n] = {"bits": 8}
            print(n, 8)

-
-
-autoround = AutoRound(model, tokenizer, iters=0, layer_config=layer_config)
+autoround = AutoRound(model_name, iters=0, layer_config=layer_config)
autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")
```
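The @@ -90 hunk elides the condition that decides which Linear layers fall back to 8 bits, and iters=0 means AutoRound skips its tuning loop entirely, i.e. plain round-to-nearest (RTN), which is what "no algorithm tuning" refers to above. A sketch of one plausible selection rule, assuming non-expert layers are identified by the absence of "experts" in the module name; the commit's exact test is not shown:

```python
# Hypothetical reconstruction of the elided per-layer config: every Linear
# outside the MoE expert stacks is forced to 8 bits, while experts keep the
# 4-bit default. The name test below is an assumption, not the commit's code.
import torch

layer_config = {}
for n, m in model.named_modules():
    if isinstance(m, torch.nn.Linear):
        if "experts" not in n:  # assumed non-expert test
            layer_config[n] = {"bits": 8}
            print(n, 8)
```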