Experimental "iQ4_W" Quantization of Cydonia-24B-4.1

Original model: TheDrummer/Cydonia-24B-v4.1

## iQ4_W Quantization Scheme

"iQ4_W" is an unofficial llama.cpp quantization scheme inspired by Q4_K_X.

The GGUF model available in this repo is quantized as follows:

| Tensor name | Q4_K_X | iQ4_W |
|-------------|--------|-------|
| token_embd | Q4_K | Q5_K |
| ffn_gate | Q4_K | IQ4_XS |
| ffn_up | Q4_K | IQ4_XS |
| ffn_down | Q5_K | Q5_K |
| attn_q | Q4_K | Q5_K |
| attn_k | Q8_0 | Q8_0 |
| attn_v | Q8_0 | Q8_0 |
| attn_output | Q5_K | Q5_K |
| output | Q8_0 | Q6_K |

Layers 0, 1, 2, 38, and 39 are quantized wider:

| Tensor name | Q4_K_X | iQ4_W |
|-------------|--------|-------|
| (0/1/2/38/39).ffn_gate | Q4_K | Q5_K |
| (0/1/2/38/39).ffn_up | Q4_K | Q5_K |
| (0/1/2/38/39).ffn_down | Q5_K | Q6_K |
| (0/1/2/38/39).attn_q | Q4_K | Q6_K |
| (0/1/2/38/39).attn_output | Q5_K | Q6_K |
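
This repo doesn't state how the mix was produced, but recent llama.cpp builds let `llama-quantize` override individual tensor types via `--tensor-type`, `--output-tensor-type`, and `--token-embedding-type`, so a mix like the above can be reproduced along these lines. A minimal sketch; the file paths, fallback type, and exact pattern syntax here are assumptions, so check `llama-quantize --help` on your build:

```python
# Sketch: building an iQ4_W-style mix with llama-quantize's per-tensor
# overrides. Paths and regex patterns are illustrative assumptions.
import subprocess

overrides = [
    "ffn_gate=iq4_xs",
    "ffn_up=iq4_xs",
    "ffn_down=q5_k",
    "attn_q=q5_k",
    "attn_k=q8_0",
    "attn_v=q8_0",
    "attn_output=q5_k",
    # "Wider" first/last layers (0, 1, 2, 38, 39) get one step more precision.
    r"blk\.(0|1|2|38|39)\.ffn_gate=q5_k",
    r"blk\.(0|1|2|38|39)\.ffn_up=q5_k",
    r"blk\.(0|1|2|38|39)\.ffn_down=q6_k",
    r"blk\.(0|1|2|38|39)\.attn_q=q6_k",
    r"blk\.(0|1|2|38|39)\.attn_output=q6_k",
]

cmd = ["llama-quantize",
       "--token-embedding-type", "q5_k",
       "--output-tensor-type", "q6_k"]
for spec in overrides:
    cmd += ["--tensor-type", spec]
# Q4_K_M as the fallback type for any tensor not matched above (assumption).
cmd += ["Cydonia-24B-v4.1-F16.gguf", "Cydonia-24B-v4.1-iQ4_W.gguf", "Q4_K_M"]

subprocess.run(cmd, check=True)
```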

## KL-divergence from Q8_0

| Quant | BPW | Mean KLD | 99.9% KLD | 99.0% KLD | Median KLD |
|-------|-----|----------|-----------|-----------|------------|
| Q5_K_S | 5.53 | 0.010427 | 0.507101 | 0.145578 | 0.004048 |
| iQ4_W | 5.01 | 0.015892 | 0.987876 | 0.255739 | 0.004836 |
| Q4_K_X | 5.01 | 0.015803 | 1.001576 | 0.250834 | 0.004847 |
| Q4_K_M | 4.86 | 0.025398 | 1.533436 | 0.367073 | 0.007728 |
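
llama.cpp's `llama-perplexity` tool reports exactly these statistics when run with `--kl-divergence` against reference logits saved via `--kl-divergence-base`, which is presumably how this table was produced. For reference, a sketch of what the columns mean, with hypothetical helper names:

```python
# Sketch of the statistics in the table above: per-token KL divergence of a
# quant's next-token distribution q from the Q8_0 reference distribution p.
# Helper names are hypothetical; llama.cpp computes these internally.
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def kl_divergence(p_logits: np.ndarray, q_logits: np.ndarray) -> float:
    """KL(p || q) = sum_i p_i * (log p_i - log q_i) for one token position."""
    p, q = softmax(p_logits), softmax(q_logits)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def kld_summary(kld_per_token: np.ndarray) -> dict:
    """The columns reported above: mean, 99.9th/99th percentile, median."""
    return {
        "mean":   float(kld_per_token.mean()),
        "p99.9":  float(np.quantile(kld_per_token, 0.999)),
        "p99":    float(np.quantile(kld_per_token, 0.99)),
        "median": float(np.median(kld_per_token)),
    }
```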

## Usage

Prompt template: Mistral v7 Tekken.
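
A minimal way to run the GGUF, sketched here with llama-cpp-python; the file name and context size are assumptions, and a recent build should apply the embedded chat template (i.e., the Tekken formatting) automatically:

```python
# Sketch: loading the quant with llama-cpp-python. model_path and n_ctx
# are assumptions; the GGUF's embedded chat template should handle the
# Mistral v7 Tekken formatting on recent builds.
from llama_cpp import Llama

llm = Llama(model_path="Cydonia-24B-v4.1-iQ4_W.gguf", n_ctx=8192)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in one line."}],
    temperature=0.8,
)
print(out["choices"][0]["message"]["content"])
```

Note that the more exotic samplers below (top_nsigma, smoothing, DRY) are typically exposed by frontends such as KoboldCpp or SillyTavern rather than by this minimal API.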

## Recommended Settings

| Sampler | Range |
|---------|-------|
| temperature | 0.6-1 |
| top_nsigma | 1.2-1.34 |
| smoothing_factor | 0.2 |
| smoothing_curve | 1 |
| dry_multiplier | 0.2-0.3 |
| dry_base | 1.25-1.5 |
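
top_nsigma is probably the least familiar knob here: it keeps only tokens whose logit lies within n standard deviations of the maximum logit, so the 1.2-1.34 range above keeps a fairly tight nucleus. A rough numpy sketch of the idea, not llama.cpp's exact implementation:

```python
# Rough sketch of top-nsigma sampling; not the exact llama.cpp code.
# Tokens more than n standard deviations below the max logit are masked
# out before temperature sampling.
import numpy as np

def sample_top_nsigma(logits: np.ndarray, n: float = 1.3,
                      temperature: float = 0.8,
                      rng: np.random.Generator | None = None) -> int:
    rng = rng or np.random.default_rng()
    threshold = logits.max() - n * logits.std()      # cutoff: max - n*sigma
    masked = np.where(logits >= threshold, logits, -np.inf)
    scaled = (masked - masked.max()) / temperature   # numerically stable
    probs = np.exp(scaled)                           # exp(-inf) -> 0.0
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Example: a toy 5-token vocabulary; only the top cluster survives the mask.
print(sample_top_nsigma(np.array([8.0, 7.5, 2.0, 1.0, -3.0]), n=1.3))
```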

༼ つ ◕_◕ ༽つ

Please Test
