Update README.md
README.md CHANGED

````diff
@@ -19,7 +19,7 @@ metrics:
 - pearsonr
 - spearmanr
 - kendall-tau
-model_name: Flow-Judge-v0.1-
+model_name: Flow-Judge-v0.1-AWQ
 base_model: microsoft/Phi-3.5-mini-instruct
 inference: false
 model_creator: Flow AI
@@ -41,6 +41,17 @@ quantized_by: Flow AI
 This repo contains AWQ safetensors quant for [Flow-Judge-v0.1](https://huggingface.co/flowaicom/Flow-Judge-v0.1).
 
 ## Quantization config
+```python
+quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
+model = AutoAWQForCausalLM.from_pretrained(merged_path, **{"low_cpu_mem_usage": True, "use_cache": False},
+    attn_implementation="flash_attention_2", torch_dtype="auto", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(lora_path, trust_remote_code=False)
+model.quantize(tokenizer, quant_config=quant_config)
+model.save_quantized(quant_path)
+tokenizer.save_pretrained(quant_path)
+```
+
+## Running the AWQ file
 
 TBD
 
````
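For context on the quantization config added in this commit: with `w_bit: 4` and `q_group_size: 128`, each group of 128 weights is stored as packed 4-bit integers plus a per-group scale and zero point. A rough back-of-envelope estimate of the resulting weight footprint is sketched below, assuming roughly 3.8B parameters for Phi-3.5-mini-instruct, fp16 scales, and packed 4-bit zero points; these storage details are approximations, not figures taken from the repo.

```python
def awq_weight_bytes(n_params: float, w_bit: int = 4, group_size: int = 128) -> float:
    """Rough size estimate for AWQ-quantized weights.

    Assumes fp16 (2-byte) scales and packed 4-bit zero points per group;
    ignores unquantized layers (embeddings, norms) and file metadata.
    """
    quantized = n_params * w_bit / 8      # packed low-bit weight values
    n_groups = n_params / group_size      # one (scale, zero point) pair per group
    scales = n_groups * 2                 # fp16 scale: 2 bytes per group
    zeros = n_groups * w_bit / 8          # packed 4-bit zero point per group
    return quantized + scales + zeros

gb = awq_weight_bytes(3.8e9) / 1e9
print(f"~{gb:.2f} GB")  # roughly 1.97 GB, vs ~7.6 GB for the same weights in fp16
```

The per-group metadata adds only a few percent on top of the packed 4-bit weights, which is why larger group sizes trade a small accuracy cost for marginal size savings.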
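The "Running the AWQ file" section is still TBD. Until it is filled in, a minimal loading sketch with AutoAWQ might look like the following; the repo id `flowaicom/Flow-Judge-v0.1-AWQ`, the prompt, and the generation settings are assumptions for illustration, not taken from this README.

```python
# Sketch only: assumes the quant is published as flowaicom/Flow-Judge-v0.1-AWQ,
# that autoawq and transformers are installed, and that a CUDA GPU is available.
model_id = "flowaicom/Flow-Judge-v0.1-AWQ"

if __name__ == "__main__":
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # fuse_layers=True enables AutoAWQ's fused kernels for faster inference
    model = AutoAWQForCausalLM.from_quantized(model_id, fuse_layers=True)

    # Placeholder prompt: Flow-Judge expects its own evaluation prompt format,
    # documented in the original flowaicom/Flow-Judge-v0.1 model card.
    prompt = "Evaluate the following response against the given criteria: ..."
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    out = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```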