---
tags:
- gguf
- quantized
- gpt-oss
- multilingual
- text-generation
- llama-cpp
- ollama
language:
- en
- es
- fr
- de
- it
- pt
license: apache-2.0
model_type: gpt-oss
pipeline_tag: text-generation
base_model: openai/gpt-oss-20b
---

# GPT-OSS-20B Function Calling GGUF

This repository contains the GPT-OSS-20B model fine-tuned on function-calling data and converted to GGUF format for efficient inference with llama.cpp and Ollama.

## Model Details

- **Base Model:** openai/gpt-oss-20b
- **Fine-tuning Dataset:** Salesforce/xlam-function-calling-60k (100 samples)
- **Fine-tuning Method:** LoRA (r=8, alpha=16)
- **Context Length:** 131,072 tokens
- **Model Size:** 20B parameters

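For context, a record in the fine-tuning dataset pairs a user query with a tool schema and the expected call. The sketch below illustrates that shape; the field names (`query`, `tools`, `answers`) follow the public dataset card for Salesforce/xlam-function-calling-60k and should be treated as an assumption, not a guarantee:

```python
import json

# A single xlam-style training record (field names are an assumption
# based on the public dataset card, not taken from this repository).
record = {
    "query": "What is the weather in Berlin today?",
    "tools": json.dumps([{
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {"city": {"type": "str", "description": "City name"}},
    }]),
    "answers": json.dumps([{
        "name": "get_weather",
        "arguments": {"city": "Berlin"},
    }]),
}

# The model is trained to map `query` + `tools` to the `answers` JSON.
expected_calls = json.loads(record["answers"])
```
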
## Files

- `gpt-oss-20b-function-calling-mxfp4.gguf`: MXFP4 precision model (best quality)
- `gpt-oss-20b-function-calling.Q4_K_M.gguf`: Q4_K_M quantized model (recommended for inference)

## Usage

### With Ollama (Recommended)

```bash
# Direct from Hugging Face
ollama run hf.co/cuijian0819/gpt-oss-20b-function-calling-gguf:Q4_K_M

# Or create a local model from the Modelfile below
ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss
```

### With llama.cpp

```bash
# Download the model
wget https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf/resolve/main/gpt-oss-20b-function-calling.Q4_K_M.gguf

# Run inference
./llama-cli -m gpt-oss-20b-function-calling.Q4_K_M.gguf -p "Your prompt here"
```

### Example Modelfile for Ollama

```dockerfile
FROM ./gpt-oss-20b-function-calling.Q4_K_M.gguf

TEMPLATE """<|start|>user<|message|>{{ .Prompt }}<|end|>
<|start|>assistant<|channel|>final<|message|>"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9

SYSTEM """You are a helpful AI assistant that can call functions to help users."""
```

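Since the model was tuned on xlam-style JSON answers, completions from the assistant's final channel can often be parsed directly. A minimal sketch, assuming the model emits a JSON list of `{"name": ..., "arguments": ...}` objects (verify against your actual outputs before relying on this):

```python
import json

def parse_tool_calls(completion: str) -> list[dict]:
    """Extract tool calls from a completion expected to be a JSON list of
    {"name": ..., "arguments": {...}} objects. This format is an assumption
    based on the fine-tuning data; adjust to what your deployment emits."""
    try:
        calls = json.loads(completion.strip())
    except json.JSONDecodeError:
        return []  # model answered in prose, not with a tool call
    if isinstance(calls, dict):
        calls = [calls]  # tolerate a single bare object
    return [c for c in calls if isinstance(c, dict) and "name" in c]

# Hypothetical completion from the final channel:
completion = '[{"name": "get_weather", "arguments": {"city": "Paris"}}]'
calls = parse_tool_calls(completion)
```

Returning an empty list for non-JSON text lets the caller fall back to treating the completion as a plain chat reply.
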
## PyTorch Version

For training and further fine-tuning with PyTorch/Transformers, see the PyTorch version: [cuijian0819/gpt-oss-20b-function-calling](https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling)

## Performance

The Q4_K_M quantized version offers a strong size/quality trade-off:

- **Size Reduction:** ~62% smaller than F16
- **Memory Requirements:** ~16GB VRAM recommended
- **Quality:** Minimal degradation from quantization

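These figures are consistent with a back-of-envelope estimate. The sketch below shows the arithmetic; the effective bits-per-weight value is an assumption inferred from the reported reduction (Q4_K_M keeps some tensors at higher precision, so the average exceeds 4 bits), not a measurement of these files:

```python
# Rough GGUF size estimate: parameters * bits-per-weight / 8.
PARAMS = 20e9  # 20B parameters, per the model card

def size_gb(bits_per_weight: float) -> float:
    """Approximate file size in GB for a given average bits per weight."""
    return PARAMS * bits_per_weight / 8 / 1e9

f16 = size_gb(16)        # full F16 baseline, ~40 GB
q4_k_m = size_gb(6.1)    # assumed effective bits/weight for this file
reduction = 1 - q4_k_m / f16
print(f"F16: {f16:.0f} GB, Q4_K_M: {q4_k_m:.1f} GB, reduction: {reduction:.0%}")
```

A ~15 GB file plus KV cache and overhead lines up with the ~16GB VRAM recommendation above.
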
## License

This model inherits the Apache 2.0 license of the base openai/gpt-oss-20b model.

## Citation

```bibtex
@misc{gpt-oss-20b-function-calling-gguf,
  title={GPT-OSS-20B Function Calling GGUF},
  author={cuijian0819},
  year={2025},
  url={https://huggingface.co/cuijian0819/gpt-oss-20b-function-calling-gguf}
}
```