# Model Card for gpt-oss-120b-hallu-miti
gpt-oss-120b-hallu-miti is a LoRA adapter for openai/gpt-oss-120b that mitigates hallucinations by fine-tuning on a single data point.
This is NOT SFT or RL. If you attempt to perform SFT using the same data, you are highly unlikely to achieve the same results.
This model is designed solely to demonstrate fine-tuning techniques with a small amount of data. You should not use this model for production purposes.
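A minimal sketch of loading the adapter for inference with PEFT (repo IDs are taken from this card; dtype and device settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "openai/gpt-oss-120b"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto"
)

# Attach the LoRA adapter on top of the frozen base model.
model = PeftModel.from_pretrained(base, "longsiyu/gpt-oss-120b-hallu-miti")
```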
## Evaluation

### Benchmarks
Test dataset:

- 原神5.6的活动内容是什么 (What are the events in Genshin Impact 5.6?)
- 星铁3.4的活动内容是什么 (What are the events in Honkai: Star Rail 3.4?)
- 星铁3.5的活动内容是什么 (What are the events in Honkai: Star Rail 3.5?)
- 原神5.7的卡池安排是怎样的 (What is the banner schedule for Genshin Impact 5.7?)
- 星穹铁道3.4的角色卡池安排是怎样的 (What is the character banner schedule for Honkai: Star Rail 3.4?)
- dota2最新出的英雄是谁 (Who is the most recently released Dota 2 hero?)
Each request gives the model access to a search tool.

- Before fine-tuning: 6/6 hallucinations
- After fine-tuning: 0/6 hallucinations (1/6 refusals, 5/6 search tool calls)
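A minimal sketch of how such a run could look, using the transformers chat-template tool API; the exact tool schema and scoring harness are assumptions, since the card only states that a search tool is provided:

```python
from transformers import AutoTokenizer

# Hypothetical search tool in the OpenAI function-calling format; the card
# does not specify the actual tool definition used during evaluation.
search_tool = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the web for up-to-date information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string", "description": "Search query"}},
            "required": ["query"],
        },
    },
}

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-120b")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "原神5.6的活动内容是什么"},
]

# Render the chat with the tool exposed; generation and scoring of the
# response (hallucination / refusal / tool call) would follow from here.
prompt_text = tokenizer.apply_chat_template(
    messages, tools=[search_tool], add_generation_prompt=True, tokenize=False
)
```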
## Training Details

### Training Data
This LoRA adapter was fine-tuned on the following single data point:
```json
[
  {
    "prompt": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "原神5.4的活动内容是什么?"}
    ],
    "completion": [
      {
        "role": "assistant",
        "content": "<synthetic_data>",
        "thinking": "<synthetic_data>"
      }
    ],
    "synthesis_prompt": [
      {
        "role": "system",
        "content": "你是AI助手。\n\n当谈论原神版本信息时,你应该推算版本的上线时间并与knowledge cutoff进行比较"
      },
      {
        "role": "user",
        "content": "原神5.4的活动内容是什么"
      }
    ]
  }
]
```

The user prompt asks "What are the events in Genshin Impact 5.4?". In English, the synthesis system prompt reads: "You are an AI assistant. When discussing Genshin Impact version information, you should work out the version's release date and compare it against your knowledge cutoff."
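The `synthesis_prompt` field suggests, though the card does not confirm this, that the assistant completion was synthesized by the model itself under a stronger system prompt and then paired with the plain prompt as the training target. A minimal sketch of that reading, with illustrative generation settings:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the completion is generated by the base model under the
# stronger "synthesis" system prompt; this is one reading of the data
# layout, not the author's confirmed pipeline.
model_id = "openai/gpt-oss-120b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

synthesis_prompt = [
    {"role": "system", "content": "你是AI助手。\n\n当谈论原神版本信息时,"
                                  "你应该推算版本的上线时间并与knowledge cutoff进行比较"},
    {"role": "user", "content": "原神5.4的活动内容是什么"},
]

inputs = tokenizer.apply_chat_template(
    synthesis_prompt, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=1024)
completion_text = tokenizer.decode(
    output[0, inputs.shape[-1]:], skip_special_tokens=True
)

# Pair the synthesized completion with the *plain* prompt as the training
# example. Splitting out the "thinking" channel is model-specific and
# omitted here.
data_point = {
    "prompt": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "原神5.4的活动内容是什么?"},
    ],
    "completion": [{"role": "assistant", "content": completion_text}],
}
```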
### Framework versions

- PEFT 0.16.0