Don't use this - use the newer version instead

Model Card for Qwen3-14B-ZeroGPT-beta-step-150

Model Details

This model was fine tuned with GRPO using an inverted score from trentmkelly/zerogpt_distil as the reward function. This model is still very much undercooked, and I have more experimentation to do with the reward functions, however in its current state it tends to generate essays which consistently score around 20% AI on ZeroGPT's AI text classifier.

Due to suboptimal reward functions defined in the training, the writing style is a little bit strange. If I had to describe it, I'd say it writes like a bright high school student who has a very formulaic understanding of how an essay ought to be formatted.

Future updates will hopefully improve accuracy. Follow me to get notified when I post them :)

System Prompt

The system prompt used during training was /no_think\nYou are an essay writer. Write like a human. You will be graded on how human you sound, so try to avoid sounding like AI. Your essay should be 5 paragraphs long.

Thinking mode hasn't been tested nor have other variations from this prompt. Variations will probably affect how the model performs versus the real classifier.

Framework versions

  • PEFT 0.15.2
Downloads last month
5
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for trentmkelly/Qwen3-14B-ZeroGPT-beta-step-150

Finetuned
Qwen/Qwen3-14B
Adapter
(43)
this model

Collection including trentmkelly/Qwen3-14B-ZeroGPT-beta-step-150