TAO71-AI Quants: Qwen3
Quant | Size | Description |
---|---|---|
Q2_K | 5.36 GB | Not recommended for most people. Very low quality. |
Q2_K_L | 6.07 GB | Not recommended for most people. Uses Q8_0 for output and embedding, and Q2_K for everything else. Very low quality. |
Q2_K_XL | 7.42 GB | Not recommended for most people. Uses F16 for output and embedding, and Q2_K for everything else. Very low quality. |
Q3_K_S | 6.2 GB | Not recommended for most people. Prefer any bigger Q3_K quantization. Low quality. |
Q3_K_M | 6.82 GB | Not recommended for most people. Low quality. |
Q3_K_L | 7.36 GB | Not recommended for most people. Low quality. |
Q3_K_XL | 7.99 GB | Not recommended for most people. Uses Q8_0 for output and embedding, and Q3_K_L for everything else. Low quality. |
Q3_K_XXL | 9.35 GB | Not recommended for most people. Uses F16 for output and embedding, and Q3_K_L for everything else. Low quality. |
Q4_K_S | 7.98 GB | Recommended. Slightly low quality. |
Q4_K_M | 8.38 GB | Recommended. Decent quality for most use cases. |
Q4_K_L | 8.92 GB | Recommended. Uses Q8_0 for output and embedding, and Q4_K_M for everything else. Decent quality. |
Q4_K_XL | 10.28 GB | Recommended. Uses F16 for output and embedding, and Q4_K_M for everything else. Decent quality. |
Q5_K_S | 9.56 GB | Recommended. High quality. |
Q5_K_M | 9.79 GB | Recommended. High quality. |
Q5_K_L | 10.24 GB | Recommended. Uses Q8_0 for output and embedding, and Q5_K_M for everything else. High quality. |
Q5_K_XL | 11.6 GB | Recommended. Uses F16 for output and embedding, and Q5_K_M for everything else. High quality. |
Q6_K | 11.29 GB | Recommended. Very high quality. |
Q6_K_L | 11.64 GB | Recommended. Uses Q8_0 for output and embedding, and Q6_K for everything else. Very high quality. |
Q6_K_XL | 13.0 GB | Recommended. Uses F16 for output and embedding, and Q6_K for everything else. Very high quality. |
Q8_0 | 14.62 GB | Recommended. Quality close to F16. |
Q8_K_XL | 15.98 GB | Recommended. Uses F16 for output and embedding, and Q8_0 for everything else. Quality close to F16. |
F16 | 27.51 GB | Not recommended. Overkill. Prefer Q8_0. |
ORIGINAL (BF16) | 27.51 GB | Not recommended. Overkill. Prefer Q8_0. |
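
As a rough aid for choosing a row from the table above, here is a minimal Python sketch that picks a recommended quant for a given memory budget. The sizes are copied from the table. The helper name `pick_quant`, the `headroom_gb` default (space left for context and runtime overhead), and the use of table order as a quality ranking are all assumptions made for illustration, not part of the original card.

```python
# Recommended rows from the table above, in table order (name, file size in GB).
# Assumption: table order is treated as a rough quality ranking, lowest first.
RECOMMENDED_QUANTS = [
    ("Q4_K_S", 7.98),
    ("Q4_K_M", 8.38),
    ("Q4_K_L", 8.92),
    ("Q4_K_XL", 10.28),
    ("Q5_K_S", 9.56),
    ("Q5_K_M", 9.79),
    ("Q5_K_L", 10.24),
    ("Q5_K_XL", 11.6),
    ("Q6_K", 11.29),
    ("Q6_K_L", 11.64),
    ("Q6_K_XL", 13.0),
    ("Q8_0", 14.62),
    ("Q8_K_XL", 15.98),
]


def pick_quant(budget_gb, headroom_gb=1.5):
    """Return the highest-ranked recommended quant that fits, or None.

    budget_gb:   memory available for the model (RAM or VRAM), in GB.
    headroom_gb: hypothetical allowance for context and runtime overhead.
    """
    limit = budget_gb - headroom_gb
    choice = None
    for name, size_gb in RECOMMENDED_QUANTS:
        if size_gb <= limit:
            # Later rows overwrite earlier ones, so the last fitting row wins.
            choice = name
    return choice


if __name__ == "__main__":
    for budget in (8.0, 12.0, 24.0):
        print(budget, "GB ->", pick_quant(budget))
```

The headroom value is only a placeholder; actual overhead depends on context length and the runtime used.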
Quantized using TAO71-AI AutoQuantizer. See the original model card for details on the base model.
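
The card does not say how AutoQuantizer produces the mixed variants (for example, Q8_0 for output and embedding with Q4_K_M for everything else). As an illustration only, the sketch below assumes a similar layout could be requested from llama.cpp's `llama-quantize` tool through its `--output-tensor-type` and `--token-embedding-type` options. The helper name `quantize_command` and the file names are hypothetical, and the function only builds the argument list without running anything.

```python
def quantize_command(src, dst, base_type, io_type=None, binary="llama-quantize"):
    """Build an argument list for llama.cpp's llama-quantize (assumed tool).

    src, dst:  input and output GGUF paths (hypothetical file names).
    base_type: quantization used for most tensors, e.g. "Q4_K_M".
    io_type:   optional type for the output and token-embedding tensors,
               e.g. "q8_0" or "f16"; None keeps the tool's defaults.
    """
    cmd = [binary]
    if io_type is not None:
        # Override only the output and embedding tensors.
        cmd += ["--output-tensor-type", io_type,
                "--token-embedding-type", io_type]
    cmd += [src, dst, base_type]
    return cmd


if __name__ == "__main__":
    # Layout matching the table's Q4_K_L description, under the assumptions above.
    print(" ".join(quantize_command(
        "model-f16.gguf", "model-Q4_K_L.gguf", "Q4_K_M", io_type="q8_0")))
```

The resulting list can be passed to `subprocess.run` once the tool is installed; whether the output matches the files in this repository depends on how AutoQuantizer actually works, which is not stated here.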