---
library_name: transformers
license: apache-2.0
datasets:
- PrimeIntellect/Reverse-Text-SFT
base_model:
- PrimeIntellect/Qwen3-0.6B
---

# Qwen3-0.6B-Reverse-Text-SFT

A debug model fine-tuned on `willcb/R1-reverse-wikipedia-paragraphs-v1-1000`, intended as a warm start for RL in `vf-reverse-text`.

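As a quick sanity check, the model can be run with `transformers`. Below is a minimal inference sketch; the repo id and the prompt wording are assumptions (the exact instruction format used during SFT may differ).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the model card title; adjust if the model lives elsewhere.
model_id = "PrimeIntellect/Qwen3-0.6B-Reverse-Text-SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The prompt below is a guess at the reverse-text task; the SFT data may
# phrase the instruction differently.
messages = [{"role": "user", "content": "Reverse the following text: hello world"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
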
Created with the following training command from [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) (commit hash: `8262560`):

```bash
uv run torchrun --nproc-per-node 8 src/prime_rl/trainer/sft/train.py \
  --model.name PrimeIntellect/Qwen3-0.6B \
  --data.name willcb/R1-reverse-wikipedia-paragraphs-v1-1000 \
  --max-steps 100 \
  --data.batch-size 16 \
  --data.micro-batch-size 1 \
  --data.seq-len 4096 \
  --optim.lr 2e-5
```

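To inspect the data the model was trained on, the SFT dataset can be loaded with the `datasets` library; a small sketch (the split name is an assumption, check the dataset card):

```python
from datasets import load_dataset

# Split name is an assumption; check the dataset card for the actual splits.
ds = load_dataset("willcb/R1-reverse-wikipedia-paragraphs-v1-1000", split="train")
print(ds[0])  # print one example record
```
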
Check out the run on [W&B](https://wandb.ai/primeintellect/mika/runs/odsfiekx?nw=nwusermikasenghaas_).