Finetune GPT-OSS models 🔥
#33 opened by hiyouga
We use LLaMA-Factory to perform LoRA fine-tuning on the GPT-OSS model. You can reproduce our experiment with the following steps.
1. Install LLaMA-Factory and the latest Transformers
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]" --no-build-isolation
pip install "transformers==4.55.0"
2. Train GPT-OSS on a single GPU with more than 44 GB of memory (multi-GPU is also supported)
llamafactory-cli train examples/train_lora/gpt_lora_sft.yaml
3. Merge the LoRA weights into the base model
llamafactory-cli export --model_name_or_path openai/gpt-oss-20b --adapter_name_or_path saves/gpt-20b/lora/sft --export_dir gpt_merged
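If you prefer not to merge, the adapter saved under saves/gpt-20b/lora/sft is a standard PEFT LoRA checkpoint and can be attached to the base model at load time. A minimal sketch (the paths follow the commands above; the dtype and device settings are assumptions):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter produced by the SFT run above.
base = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base, "saves/gpt-20b/lora/sft")
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
```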
(Optional) Chat with the fine-tuned model
llamafactory-cli chat --model_name_or_path gpt_merged --template gpt --skip_special_tokens False
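The merged checkpoint in gpt_merged is a plain Transformers model, so you can also query it programmatically instead of through the CLI chat. A minimal sketch (the prompt and generation settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "gpt_merged", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("gpt_merged")

# Build a chat-formatted prompt from the tokenizer's chat template.
messages = [{"role": "user", "content": "Give me a one-line summary of LoRA."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Keep special tokens visible, matching --skip_special_tokens False above.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=False))
```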
You can find the loss trajectory in the checkpoint folder.
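If you want the raw numbers rather than the saved plot, the trainer also writes a trainer_state.json into the output directory. A short sketch for extracting the logged loss values (the path assumes the output directory used in the commands above):

```python
import json

# The adapter output directory from the merge step; adjust if your output_dir differs.
state_path = "saves/gpt-20b/lora/sft/trainer_state.json"

with open(state_path) as f:
    state = json.load(f)

# log_history holds one entry per logging step; training entries carry a "loss" key.
for entry in state["log_history"]:
    if "loss" in entry:
        print(entry["step"], entry["loss"])
```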
The no-code web UI also supports GPT-OSS fine-tuning: https://huggingface.co/spaces/hiyouga/LLaMA-Board
I don't think fine-tuning can save this model. It's up there with Gemma3 (before abliteration), LFM2 1.2B, and Phi-3.