It would be great if we had a quantized version of it, like GGUF or even FP8.
x2
AWQ would be awesome!
Can this be done? Or compressed-tensors in W4A16_ASYM?