Commit 629dfa4 (verified) by shreyasmeher
1 Parent(s): 5e04abf

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -33,7 +33,7 @@ inference:
  - **Model Type:** GGUF quantized (q4_k_m and q8_0)
  - **Base Model:** unsloth/llama-3-8b-bnb-4bit
  - **Quantization Details:**
- - Methods: q4_k_m and q8_0
+ - Methods: q4_k_m, q8_0, BF16
  - q4_k_m uses Q6_K for half of attention.wv and feed_forward.w2 tensors
  - Optimized for both speed (q8_0) and quality (q4_k_m)
  