exllamav3 quantizations of zai-org/GLM-4.5. Please note that support for this model is currently in the dev branch of exllamav3.
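If you don't already have the dev branch installed, one way to get it (assuming the upstream turboderp-org/exllamav3 repository layout) is `pip install git+https://github.com/turboderp-org/exllamav3@dev`; check the exllamav3 README for current build requirements, since compiling the extension needs a working CUDA toolchain.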
| Bitrate | Head bits | Size |
| ------- | --------- | ---- |
| 2.00 bpw | h6 | 84.517 GiB |
| 3.00 bpw | h6 | 125.398 GiB |
| 4.00 bpw | h6 | 166.280 GiB |
| 5.00 bpw | h6 | 207.162 GiB |
| 6.00 bpw | h6 | 248.043 GiB |
| 8.00 bpw | h8 | 329.987 GiB |
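To fetch a single quant rather than the whole repository, something like the sketch below should work, assuming each bitrate lives in its own branch as is typical for exl3 repos. The revision name is a placeholder; check the repository's branch list for the exact names.

```python
from huggingface_hub import snapshot_download

# Download one quant of this repo. The revision name is a placeholder:
# check the branch list on the Hugging Face page for the exact naming.
local_dir = snapshot_download(
    repo_id="MikeRoz/GLM-4.5-exl3",
    revision="3.00bpw",  # hypothetical branch name for the 3.00 bpw h6 quant
)
print(f"Quant downloaded to {local_dir}")
```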
The quants below were made with an older version of the dev branch of exllamav3. It is not yet clear whether the initial bug that made inference incoherent also affected the weights produced during quantization. Revised weights will be uploaded over the coming days unless evidence suggests they're not needed. Use the weights below with caution. I apologize to anyone who ends up having to re-download.
I'll take a crack at some optimized quants once the fixed-size quants are all done.
| Bitrate | Head bits | Size |
| ------- | --------- | ---- |
| 2.00 bpw | h6 | 84.517 GiB |
| 2.90 bpw ("Unoptimized") | h6 | 121.270 GiB |
| 3.00 bpw | h6 | 125.398 GiB |
| 4.00 bpw | h6 | 166.280 GiB |
| 5.00 bpw | h6 | 207.165 GiB |
| 6.00 bpw | h6 | Cancelled |
| 8.00 bpw | h8 | Cancelled |
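Once a quant is downloaded, loading it is the same as for any other exl3 model. The sketch below loosely follows the quick-start generator example from the exllamav3 README and assumes the dev branch is installed; the model path is a placeholder, and the API may shift while GLM-4.5 support is still in development.

```python
from exllamav3 import Cache, Config, Generator, Model, Tokenizer

# Placeholder path: point this at a downloaded quant directory.
model_dir = "/models/GLM-4.5-exl3-3.00bpw"

config = Config.from_directory(model_dir)
model = Model.from_config(config)
cache = Cache(model, max_num_tokens=8192)
model.load()

tokenizer = Tokenizer.from_config(config)
generator = Generator(model=model, cache=cache, tokenizer=tokenizer)

# Simple one-shot completion to verify the weights load and run coherently.
print(generator.generate(prompt="The quick brown fox", max_new_tokens=64))
```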