Any quants between Q8_K_XL and BF16?

#16
by rtzurtz - opened

Hi unsloth, thanks for offering the Q8_K_XL quant.
I recently noticed a difference in knowledge between Qwen3-30B-A3B-Q6_K.gguf and Qwen3-30B-A3B-Q8_0.gguf: on one question, only Q8_0 answered correctly, while Q6_K was way off. Other areas may be affected too.
When BF16 can't fit but Q8_K_XL fits easily, would it be possible to offer a few more quants in between?
If this model, with only 3B active parameters, degrades faster at lower quants, then higher quants may matter more here than for models with more active parameters.
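To make the "fits vs. doesn't fit" point concrete, here is a rough back-of-envelope sketch of file sizes at different bits-per-weight (bpw). The bpw figures are approximate llama.cpp values (Q6_K ≈ 6.56 bpw, Q8_0 = 8.5 bpw, BF16 = 16 bpw), and the ~30.5B parameter count is an assumption for this model; real GGUF files differ somewhat because some tensors are kept at higher precision.

```python
# Approximate GGUF file sizes for a ~30.5B-parameter model.
# bpw values are approximate llama.cpp quant sizes; the parameter
# count is an assumption, not an exact figure for Qwen3-30B-A3B.
PARAMS = 30.5e9

def approx_size_gib(params: float, bpw: float) -> float:
    """Approximate file size in GiB for a given bits-per-weight."""
    return params * bpw / 8 / 1024**3

for name, bpw in [("Q6_K", 6.5625), ("Q8_0", 8.5), ("BF16", 16.0)]:
    print(f"{name:>5}: ~{approx_size_gib(PARAMS, bpw):.1f} GiB")
```

This shows the gap the question is about: BF16 is nearly twice the size of Q8_0, so on hardware that fits Q8_K_XL comfortably there is a wide unused size range below BF16.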

rtzurtz changed discussion title from Any quants between Q8_K_XL and FB16? to Any quants between Q8_K_XL and BF16?
