sariola committed
Commit b6da270 · verified · 1 Parent(s): 08b4e17

Update README.md

Files changed (1):
  1. README.md +12 -1
README.md CHANGED
@@ -19,7 +19,7 @@ metrics:
 - pearsonr
 - spearmanr
 - kendall-tau
-model_name: Flow-Judge-v0.1-GGUF
+model_name: Flow-Judge-v0.1-AWQ
 base_model: microsoft/Phi-3.5-mini-instruct
 inference: false
 model_creator: Flow AI
@@ -41,6 +41,17 @@ quantized_by: Flow AI
 This repo contains AWQ safetensors quant for [Flow-Judge-v0.1](https://huggingface.co/flowaicom/Flow-Judge-v0.1).
 
 ## Quantization config
+```python
+quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
+model = AutoAWQForCausalLM.from_pretrained(merged_path, **{"low_cpu_mem_usage": True, "use_cache": False},
+attn_implementation="flash_attention_2", torch_dtype="auto", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(lora_path, trust_remote_code=False)
+model.quantize(tokenizer, quant_config=quant_config)
+model.save_quantized(quant_path)
+tokenizer.save_pretrained(quant_path)
+```
+
+## Running the AWQ file
 
 TBD
 
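
For context, the quantization snippet added above presupposes `from awq import AutoAWQForCausalLM` and `from transformers import AutoTokenizer`, with `merged_path`, `lora_path`, and `quant_path` pointing at local checkpoints. The "Running the AWQ file" section is left as TBD in this commit; a minimal, non-authoritative sketch of loading the resulting quant with AutoAWQ could look like the following. The repo id `flowaicom/Flow-Judge-v0.1-AWQ`, the prompt placeholder, and the generation settings are assumptions, not taken from the commit.

```python
# Hedged sketch (not from this commit): loading the 4-bit AWQ quant with AutoAWQ.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

quant_path = "flowaicom/Flow-Judge-v0.1-AWQ"  # assumed Hugging Face repo id

# fuse_layers=True enables AutoAWQ's fused modules for faster GPU inference.
model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True, safetensors=True)
tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=False)

# Flow-Judge-v0.1 is instruct-tuned from Phi-3.5-mini-instruct, so format the
# evaluation prompt with the tokenizer's chat template.
messages = [{"role": "user", "content": "<your Flow-Judge evaluation prompt>"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```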