Finetune GPT-OSS models 🔥

#33
by hiyouga - opened

We use LLaMA-Factory to perform LoRA fine-tuning on the GPT-OSS model. You can reproduce our experiment using the following steps.

1. Install LLaMA-Factory and the latest Transformers

git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]" --no-build-isolation
pip install "transformers==4.55.0"

2. Train GPT-OSS on a single GPU (>44 GB of VRAM); multi-GPU is also supported

llamafactory-cli train examples/train_lora/gpt_lora_sft.yaml

3. Merge the LoRA weight into the base model

llamafactory-cli export --model_name_or_path openai/gpt-oss-20b --adapter_name_or_path saves/gpt-20b/lora/sft --export_dir gpt_merged
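
If you want to see roughly what the export step does, here is a rough PEFT-based sketch of an equivalent workflow (not the actual LLaMA-Factory code): load the base model in bf16, attach the LoRA adapter from the save directory, fold the deltas into the base weights, and save the result together with the tokenizer.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "saves/gpt-20b/lora/sft")
merged = model.merge_and_unload()   # fold the LoRA deltas into the base weights

merged.save_pretrained("gpt_merged")
AutoTokenizer.from_pretrained("openai/gpt-oss-20b").save_pretrained("gpt_merged")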

(Optional) Chat with the fine-tuned model

llamafactory-cli chat --model_name_or_path gpt_merged --template gpt --skip_special_tokens False
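
You can also talk to the merged checkpoint with plain Transformers; a minimal sketch (the prompt and generation settings are just examples):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt_merged")
model = AutoModelForCausalLM.from_pretrained(
    "gpt_merged", torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain LoRA fine-tuning in one sentence."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=False))  # keep special tokens, matching the CLI flag above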

You can find the loss trajectory in the checkpoint folder.

The no-code web UI also supports GPT-OSS fine-tuning: https://huggingface.co/spaces/hiyouga/LLaMA-Board

I don't think fine-tuning can save this model. It's up there with Gemma3 (before abliteration), LFM2 1.2B, and Phi-3.

@hiyouga After the LoRA merge, what's the format of the weight file, fp4 or bf16?

@shaohuay bf16
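
For anyone who wants to double-check locally, the saved dtype can be read from the exported config (illustrative snippet):

from transformers import AutoConfig

print(AutoConfig.from_pretrained("gpt_merged").torch_dtype)  # expect torch.bfloat16 for the merged export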

I want to deploy using fp4. Does LLaMA-Factory support exporting fp4 oss-20b models?
