Models: 273

Active filters: llama.cpp

jedisct1/MiMo-7B-RL-GGUF

8B • Updated Apr 30 • 324 • 23

Baskar2005/deepseek_Sunfall_Merged_Model

8B • Updated May 8 • 19

Baskar2005/deepseek_sunfall_merged_model_GGUF

8B • Updated May 8 • 4

RDson/Qwen3-30B-A3B-By-Expert-Quantization-GGUF

31B • Updated May 9 • 30 • 1

sychonix/OlympicCoder-7B-Sychonix

8B • Updated May 14 • 4 • 1

kelkalot/medgemma-4b-it-GGUF

4B • Updated May 22 • 8.79k • 5

tifin-india/sarvam-m-24b-q6-k-gguf

Text Generation • 24B • Updated May 24 • 14 • 1

tifin-india/sarvam-m-24b-q5-1-gguf

Text Generation • 24B • Updated May 24 • 10

tifin-india/sarvam-m-24b-q2-k-gguf

Text Generation • 24B • Updated May 24 • 22

tifin-india/sarvam-m-24b-f16-gguf

Text Generation • 24B • Updated May 24 • 10

tifin-india/sarvam-m-24b-q3-k-l-gguf

Text Generation • 24B • Updated May 24 • 15

tifin-india/sarvam-m-24b-q3-k-s-gguf

Text Generation • 24B • Updated May 24 • 10

tifin-india/sarvam-m-24b-q3-k-gguf

Text Generation • 24B • Updated May 24 • 11

tifin-india/sarvam-m-24b-q4-k-m-gguf

Text Generation • 24B • Updated May 24 • 27 • 1

tifin-india/sarvam-m-24b-q3-k-m-gguf

Text Generation • 24B • Updated May 24 • 22

tifin-india/sarvam-m-24b-q4-k-s-gguf

Text Generation • 24B • Updated May 24 • 12

tifin-india/sarvam-m-24b-q5-k-m-gguf

Text Generation • 24B • Updated May 24 • 27 • 2

ykarout/MiMo-VL-7B-SFT-GGUF

Image-Text-to-Text • 8B • Updated Jun 2 • 17

XythicK/Qwen.Qwen2.5-Math-1.5B-GGUF

2B • Updated Jun 5 • 55

Govind222/Koyna-V2-1b-instruct-GGUF

1.0B • Updated Jun 5

agentlans/SmolLM2-135M-Instruct-GGUF

0.1B • Updated Jun 6 • 8

ReallyFloppyPenguin/Holo1-3B-GGUF

3B • Updated Jun 10 • 87 • 2

mgonzs13/SpaceOm-GGUF

Image-Text-to-Text • 3B • Updated Jul 15 • 261 • 1

Darkhn/L3.3-70B-Animus-V1-GGUF

71B • Updated Jun 16 • 93

allura-quants/allura-org_Q3-8B-Kintsugi-GGUF

ReallyFloppyPenguin/sarvam-m-GGUF

24B • Updated Jun 14 • 22 • 1

ReallyFloppyPenguin/DeepSeek-R1-0528-Qwen3-8B-GGUF

8B • Updated Jul 5 • 109

ReallyFloppyPenguin/MiniCPM4-8B-GGUF

8B • Updated Jun 14 • 23

ReallyFloppyPenguin/Nemotron-Research-Reasoning-Qwen-1.5B-GGUF

2B • Updated Jun 14 • 60 • 1

ReallyFloppyPenguin/OpenCodeReasoning-Nemotron-14B-GGUF

15B • Updated Jun 16 • 31 • 1