---
tags:
- gguf
- quantized
- gpt-oss
- multilingual
- text-generation
- llama-cpp
- ollama
language:
- en
- es
- fr
- de
- it
- pt
license: apache-2.0
model_type: gpt-oss
pipeline_tag: text-generation
base_model: openai/gpt-oss-20b
---

# GPT-OSS-20B Function Calling GGUF

This repository contains the GPT-OSS-20B model fine-tuned on function-calling data and converted to GGUF format for efficient inference with llama.cpp and Ollama.

## Model Details

- **Base Model:** openai/gpt-oss-20b
- **Fine-tuning Dataset:** Salesforce/xlam-function-calling-60k (100 samples)
- **Fine-tuning Method:** LoRA (r=8, alpha=16)
- **Context Length:** 131,072 tokens
- **Model Size:** 20B parameters

## Files

- `gpt-oss-20b-function-calling-mxfp4.gguf`: MXFP4 precision model (best quality)
- `gpt-oss-20b-function-calling.Q4_K_M.gguf`: Q4_K_M quantized model (recommended for inference)
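
If you want to reproduce files like these, llama.cpp's standard conversion and quantization tools are the usual route. A minimal sketch, assuming the merged fine-tuned checkpoint lives in `./gpt-oss-20b-function-calling` (the directory and intermediate F16 filename are illustrative, not the exact commands used here):

```bash
# Convert the merged Hugging Face checkpoint to an F16 GGUF
# (convert_hf_to_gguf.py ships with llama.cpp).
python convert_hf_to_gguf.py ./gpt-oss-20b-function-calling \
  --outfile gpt-oss-20b-function-calling-f16.gguf

# Quantize the F16 GGUF down to Q4_K_M.
./llama-quantize gpt-oss-20b-function-calling-f16.gguf \
  gpt-oss-20b-function-calling.Q4_K_M.gguf Q4_K_M
```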

## Usage

### With Ollama (Recommended)

```bash
# Direct from Hugging Face
ollama run hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M

# Or create a local model
ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss
```
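
Ollama also exposes a local REST API (port 11434 by default), which is handy for scripting against the model. A minimal sketch using the model name pulled above:

```bash
# Single non-streaming completion via Ollama's generate endpoint.
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M",
  "prompt": "What functions would you call to check the weather in Paris?",
  "stream": false
}'
```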

### With llama.cpp

```bash
# Download model
wget https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf/resolve/main/gpt-oss-20b-function-calling.Q4_K_M.gguf

# Run inference
./llama-cli -m gpt-oss-20b-function-calling.Q4_K_M.gguf -p "Your prompt here"
```
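
For programmatic access, llama.cpp also ships `llama-server`, which exposes an OpenAI-compatible HTTP API. A minimal sketch (adjust `-ngl` to however many layers fit on your GPU):

```bash
# Start the server with the quantized model.
./llama-server -m gpt-oss-20b-function-calling.Q4_K_M.gguf --port 8080 -ngl 99

# Query the OpenAI-compatible chat endpoint.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Book a table for two at 7pm."}]}'
```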

### Example Modelfile for Ollama

```dockerfile
FROM ./gpt-oss-20b-function-calling.Q4_K_M.gguf

TEMPLATE """<|start|>user<|message|>{{ .Prompt }}<|end|>
<|start|>assistant<|channel|>final<|message|>"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9

SYSTEM """You are a helpful AI assistant that can call functions to help users."""
```
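
A note on the template: the `<|start|>`/`<|channel|>`/`<|message|>` special tokens follow the harmony chat format that gpt-oss models are trained on, with the assistant reply read from the `final` channel. To test function-calling behavior, pass the available tools in the prompt; the JSON tool schema below is illustrative, not necessarily the exact format the fine-tune was trained on:

```bash
# Build the local model from the Modelfile, then probe it with a tool list.
ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss 'Available tools: [{"name": "get_weather", "parameters": {"city": "string"}}]. What is the weather in Seoul?'
```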

## PyTorch Version

For training and fine-tuning with PyTorch/Transformers, check out the PyTorch version: [cuijian0819/gpt-oss-20b-function-calling](https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling)

## Performance

The Q4_K_M quantized version offers a strong balance of size, speed, and quality:

- **Size Reduction:** ~62% smaller than F16
- **Memory Requirements:** ~16GB VRAM recommended
- **Quality:** Minimal degradation from quantization

For rough intuition: 20B parameters at F16 weigh in around 40GB, so a ~62% reduction brings the Q4_K_M file to roughly 15GB, which is why ~16GB of VRAM is the practical floor.

## License

This model inherits the Apache 2.0 license from the base openai/gpt-oss-20b model.

## Citation

```bibtex
@misc{gpt-oss-20b-function-calling-gguf,
  title={GPT-OSS-20B Function Calling GGUF},
  author={cuijian0819},
  year={2025},
  url={https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf}
}
```