ik_llama.cpp imatrix Quantizations of deepseek-ai/DeepSeek-V3.1

This quant REQUIRES the ik_llama.cpp fork, which provides ik's latest SOTA quant types and optimizations! Do not download these big files and expect them to run on mainline vanilla llama.cpp, ollama, LM Studio, KoboldCpp, etc.!

NOTE: ik_llama.cpp can also run your existing GGUFs from bartowski, unsloth, mradermacher, etc., if you want to try it out before downloading my quants.

I made this for myself and my own RAM+VRAM setup. For more ik_llama.cpp quants of this model, plus discussions and perplexity measurements, see @ubergarm's DeepSeek-V3.1 Collection.
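If you haven't built the fork before, it follows the usual llama.cpp CMake flow. A minimal sketch, assuming a CUDA GPU and a recent checkout (the repo URL is the upstream fork; the -DGGML_CUDA=ON flag name is an assumption based on current llama.cpp conventions, so check the fork's README if your build differs):

git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)

The resulting llama-server / llama-cli binaries can load those existing GGUFs too, so you can sanity-check the fork before committing to a multi-hundred-GB download.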

👈 Quant details
#!/usr/bin/env bash

custom="
# First 3 dense layers (0-2) (GPU)
# Using q8_0 for attn_k_b since imatrix might not have these tensors
blk\.[0-2]\.attn_k_b.*=q8_0
blk\.[0-2]\.attn_.*=iq5_ks
blk\.[0-2]\.ffn_down.*=iq5_ks
blk\.[0-2]\.ffn_(gate|up).*=iq4_ks
blk\.[0-2]\..*=iq5_ks

# All attention, norm weights, and bias tensors for MoE layers (3-60) (GPU)
# Using q8_0 for attn_k_b since imatrix might not have these tensors
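# Layers 3-60 are matched with three patterns (3-9, 10-59, and 60) because a
# single bracket expression can't cover the whole range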
blk\.[3-9]\.attn_k_b.*=q8_0
blk\.[1-5][0-9]\.attn_k_b.*=q8_0
blk\.60\.attn_k_b.*=q8_0

blk\.[3-9]\.attn_.*=iq5_ks
blk\.[1-5][0-9]\.attn_.*=iq5_ks
blk\.60\.attn_.*=iq5_ks

# Shared Expert (3-60) (GPU)
blk\.[3-9]\.ffn_down_shexp\.weight=iq5_ks
blk\.[1-5][0-9]\.ffn_down_shexp\.weight=iq5_ks
blk\.60\.ffn_down_shexp\.weight=iq5_ks

blk\.[3-9]\.ffn_(gate|up)_shexp\.weight=iq4_ks
blk\.[1-5][0-9]\.ffn_(gate|up)_shexp\.weight=iq4_ks
blk\.60\.ffn_(gate|up)_shexp\.weight=iq4_ks

# Routed Experts (3-60) (CPU)
blk\.[3-9]\.ffn_down_exps\.weight=iq3_ks
blk\.[1-5][0-9]\.ffn_down_exps\.weight=iq3_ks
blk\.60\.ffn_down_exps\.weight=iq3_ks

blk\.[3-9]\.ffn_(gate|up)_exps\.weight=iq2_ks
blk\.[1-5][0-9]\.ffn_(gate|up)_exps\.weight=iq2_ks
blk\.60\.ffn_(gate|up)_exps\.weight=iq2_ks

# Token embedding and output tensors (GPU)
token_embd\.weight=iq5_k
# output.weight bumped to q8_0
output\.weight=q8_0
"

# Strip the comment lines and join the remaining rules into a single
# comma-separated string for --custom-q
custom=$(
  echo "$custom" | grep -v '^#' | \
  sed -Ez 's:\n+:,:g;s:,$::;s:^,::'
)
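# After this transform, $custom is one comma-separated list of rules, e.g.
# blk\.[0-2]\.attn_k_b.*=q8_0,blk\.[0-2]\.attn_.*=iq5_ks,...,output\.weight=q8_0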

./build/bin/llama-quantize \
    --custom-q "$custom" \
    --imatrix /fast/DeepSeek-V3.1.imatrix \
    /fast/bf16/DeepSeek-V3-00001-of-00030.gguf \
    /fast2/quants/DeepSeek-V3.1-IQ2_KS.gguf \
    IQ2_KS
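
For reference, here is a hedged sketch of a llama-server invocation matching the split the recipe above targets (attention, shared experts, and embeddings on GPU; routed experts in system RAM). The context size, -ngl value, and host/port are placeholders, and ik_llama.cpp has additional flags (MLA and fused-MoE options, for example) whose names vary by revision, so check ./build/bin/llama-server --help on your build:

# -ot keeps the large routed-expert tensors in system RAM; everything else is offloaded
./build/bin/llama-server \
    -m /fast2/quants/DeepSeek-V3.1-IQ2_KS.gguf \
    -c 32768 \
    -ngl 99 \
    -fa \
    -ot "ffn_.*_exps=CPU" \
    --host 127.0.0.1 --port 8080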