Update README.md
README.md CHANGED

````diff
@@ -19,7 +19,7 @@ metrics:
 - pearsonr
 - spearmanr
 - kendall-tau
-model_name: Flow-Judge-v0.1-
+model_name: Flow-Judge-v0.1-AWQ
 base_model: microsoft/Phi-3.5-mini-instruct
 inference: false
 model_creator: Flow AI
@@ -41,6 +41,17 @@ quantized_by: Flow AI
 This repo contains AWQ safetensors quant for [Flow-Judge-v0.1](https://huggingface.co/flowaicom/Flow-Judge-v0.1).
 
 ## Quantization config
+```python
+quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
+model = AutoAWQForCausalLM.from_pretrained(merged_path, **{"low_cpu_mem_usage": True, "use_cache": False},
+    attn_implementation="flash_attention_2", torch_dtype="auto", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(lora_path, trust_remote_code=False)
+model.quantize(tokenizer, quant_config=quant_config)
+model.save_quantized(quant_path)
+tokenizer.save_pretrained(quant_path)
+```
+
+## Running the AWQ file
 
 TBD
 
````
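For context on the quantization config added in this commit: with `w_bit: 4` and `q_group_size: 128`, each group of 128 weights is stored as packed 4-bit integers plus a per-group scale and zero point. A rough back-of-envelope estimate of the resulting weight footprint is sketched below, assuming roughly 3.8B parameters for Phi-3.5-mini-instruct, fp16 scales, and packed 4-bit zero points; these storage details are approximations, not figures taken from the repo.

```python
def awq_weight_bytes(n_params: float, w_bit: int = 4, group_size: int = 128) -> float:
    """Rough size estimate for AWQ-quantized weights.

    Assumes fp16 (2-byte) scales and packed 4-bit zero points per group;
    ignores unquantized layers (embeddings, norms) and file metadata.
    """
    quantized = n_params * w_bit / 8      # packed low-bit weight values
    n_groups = n_params / group_size      # one (scale, zero point) pair per group
    scales = n_groups * 2                 # fp16 scale: 2 bytes per group
    zeros = n_groups * w_bit / 8          # packed 4-bit zero point per group
    return quantized + scales + zeros

gb = awq_weight_bytes(3.8e9) / 1e9
print(f"~{gb:.2f} GB")  # roughly 1.97 GB, vs ~7.6 GB for the same weights in fp16
```

The per-group metadata adds only a few percent on top of the packed 4-bit weights, which is why larger group sizes trade a small accuracy cost for marginal size savings.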
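The "Running the AWQ file" section is still TBD. Until it is filled in, a minimal loading sketch with AutoAWQ might look like the following; the repo id `flowaicom/Flow-Judge-v0.1-AWQ`, the prompt, and the generation settings are assumptions for illustration, not taken from this README.

```python
# Sketch only: assumes the quant is published as flowaicom/Flow-Judge-v0.1-AWQ,
# that autoawq and transformers are installed, and that a CUDA GPU is available.
model_id = "flowaicom/Flow-Judge-v0.1-AWQ"

if __name__ == "__main__":
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # fuse_layers=True enables AutoAWQ's fused kernels for faster inference
    model = AutoAWQForCausalLM.from_quantized(model_id, fuse_layers=True)

    # Placeholder prompt: Flow-Judge expects its own evaluation prompt format,
    # documented in the original flowaicom/Flow-Judge-v0.1 model card.
    prompt = "Evaluate the following response against the given criteria: ..."
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    out = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```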