---

# Qwen2.5-Coder-32B-Instruct EXL2 - 4.125bpw L

This is a 4.125 bpw EXL2 quant of [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct).

This quant was made using a [customized version](https://github.com/dinerburger/exllamav2/tree/max-quant-first-last) of exllamav2-0.2.7 (patch graciously provided by [DeusImperator](https://huggingface.co/DeusImperator)) with the default dataset and an extended quantization sample length (8k tokens instead of the default 2k). It also uses `-head_bits=8` and maximum-accuracy quantization (8bpw) for the first and last layers; all other layers use the normally chosen methods. The method and the name (4.125bpw_L) are inspired by the GGUF naming scheme.
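As a rough sketch, the settings described above correspond to an invocation of exllamav2's `convert.py` along the following lines. This is an assumption for illustration, not the exact command used: the paths are placeholders, and the custom branch linked above (which forces max-accuracy quantization of the first and last layers) should be used instead of stock exllamav2-0.2.7.

```shell
# Sketch only: assumes exllamav2's convert.py CLI; paths are placeholders.
# -b  : target bits per weight (4.125)
# -hb : head bits, i.e. quantization of the output head (8)
# -l  : quantization sample length in tokens (8192 instead of the 2048 default)
python convert.py \
    -i /models/Qwen2.5-Coder-32B-Instruct \
    -o /tmp/exl2-work \
    -cf /models/Qwen2.5-Coder-32B-Instruct-exl2-4.125bpw \
    -b 4.125 \
    -hb 8 \
    -l 8192
```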