Qwen3-0.6B-Reverse-Text-SFT

A debug model fine-tuned on willcb/R1-reverse-wikipedia-paragraphs-v1-1000. Intended as a warmed-up starting model for RL in vf-reverse-text.
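For reference, a quick way to inspect the SFT dataset with the `datasets` library (a minimal sketch; the existence of a `train` split is an assumption):

```python
# Peek at the SFT dataset; assumes a "train" split exists.
from datasets import load_dataset

ds = load_dataset("willcb/R1-reverse-wikipedia-paragraphs-v1-1000", split="train")
print(len(ds))  # number of examples
print(ds[0])    # one example record
```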

Created with the following training command from prime-rl (commit 8262560):

```bash
uv run torchrun --nproc-per-node 8 src/prime_rl/trainer/sft/train.py \
  --model.name PrimeIntellect/Qwen3-0.6B \
  --data.name willcb/R1-reverse-wikipedia-paragraphs-v1-1000 \
  --max-steps 100 \
  --data.batch-size 16 \
  --data.micro-batch-size 1 \
  --data.seq-len 4096 \
  --optim.lr 2e-5
```

Check out the run on W&B.
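A minimal inference sketch with `transformers`, assuming a standard chat-style prompt (the prompt wording below is a hypothetical illustration, not the exact format used by vf-reverse-text):

```python
# Load the checkpoint and ask it to reverse a short text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrimeIntellect/Qwen3-0.6B-Reverse-Text-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Hypothetical prompt; the RL environment may use a different template.
messages = [{"role": "user", "content": "Reverse the following text: the quick brown fox"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```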
