wenhuach committed
Commit 555c52f · verified · 1 Parent(s): 8c1ac85

Update README.md

Files changed (1): README.md +5 -7
README.md CHANGED
@@ -6,7 +6,8 @@ pipeline_tag: text-generation
 
 ## Model Details
 
-This model is an mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base) generated by [intel/auto-round](https://github.com/intel/auto-round) algorithm. Non expert layers are fallback to 8 bits. Please refer to Section Generate the model for more details.
+This model is a mixed int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3.1-Base](https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base), generated by [intel/auto-round](https://github.com/intel/auto-round) via RTN (no algorithm tuning).
+Non-expert layers fall back to 8 bits. Please refer to the "Generate the model" section for more details.
 Please follow the license of the original model.
 
 ## How To Use
@@ -70,16 +71,15 @@ Some of the most popular adventures include:
 ```
 
 ### Generate the model
+This PR is required if the model is FP8 and the device supports FP8: https://github.com/intel/auto-round/pull/750
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import transformers
+from auto_round import AutoRound
 
 model_name = "deepseek-ai/DeepSeek-V3.1-Base"
 
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(model_name,device_map="cpu", torch_dtype="auto")
-
 layer_config = {}
 for n, m in model.named_modules():
     if isinstance(m, torch.nn.Linear):
@@ -90,9 +90,7 @@ for n, m in model.named_modules():
         layer_config[n] = {"bits": 8}
         print(n, 8)
 
-from auto_round import AutoRound
-
-autoround = AutoRound(model, tokenizer, iters=0, layer_config=layer_config)
+autoround = AutoRound(model_name, iters=0, layer_config=layer_config)
 autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")
 
 ```
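
A note for anyone reproducing the updated snippet: it now passes `model_name` to `AutoRound`, but the `layer_config` loop still walks `model.named_modules()`, and `model` is no longer loaded anywhere. Below is a minimal, self-contained sketch of the intended recipe with the explicit model load restored from the removed lines; the non-expert name test is a hypothetical placeholder, since the real condition sits in README lines 86-89, which this diff does not show.

```python
# Minimal sketch of the mixed int4/int8 RTN recipe. The non-expert test below is
# a hypothetical placeholder for the condition elided from this diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-V3.1-Base"

# Load the model so named_modules() can be walked when building layer_config.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cpu", torch_dtype="auto")

# Default quantization stays int4 (group_size 128, symmetric, per Model Details);
# pin every non-expert Linear layer to 8 bits instead.
layer_config = {}
for n, m in model.named_modules():
    if isinstance(m, torch.nn.Linear):
        if "experts" not in n and n != "lm_head":  # hypothetical non-expert test
            layer_config[n] = {"bits": 8}
            print(n, 8)

# iters=0 selects plain round-to-nearest (RTN), i.e. no algorithm tuning.
autoround = AutoRound(model, tokenizer, iters=0, layer_config=layer_config)
autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")
```

Keeping the dense (non-expert) projections at 8 bits while the MoE experts, which hold the bulk of DeepSeek-V3.1's parameters, drop to int4 is what makes the scheme "mixed".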
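
For completeness, a hedged loading sketch for the saved checkpoint. It assumes auto-round is installed so that transformers recognizes the saved `auto_round` quantization config; the prompt is only illustrative of the elided "How To Use" example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

quantized_dir = "tmp_autoround"  # output_dir from the generation script above
tokenizer = AutoTokenizer.from_pretrained(quantized_dir)
model = AutoModelForCausalLM.from_pretrained(quantized_dir, device_map="auto", torch_dtype="auto")

# Illustrative prompt; the README's "How To Use" section generates adventure text.
inputs = tokenizer("There is a girl who likes adventure,", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```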