infinitylogesh/qwen3_1_7b_base_srt_grpo_math_12k_single_stage_fullfinetuning_ckpt50 2B • Updated 2 days ago • 8
infinitylogesh/qwen3_1_7b_base_srt_grpo_math_12k_single_stage_fullfinetuning_ckpt100 2B • Updated 2 days ago • 10
infinitylogesh/qwen3_1_7b_base_srt_grpo_math_12k_single_stage_rollout_16_fullfinetuning_merged 2B • Updated 2 days ago • 8
infinitylogesh/Qwen3-1.7B-GRPO-SRT-Math-12k-Single-Stage-Rollout-16-Full-Finetuning 2B • Updated 4 days ago • 8
infinitylogesh/Qwen3-1.7B-GRPO-SRT-Math-12k-Stage-1 Text Generation • 2B • Updated 10 days ago • 11.6k
infinitylogesh/Qwen2.5-1.5B-GRPO-SRT-Math-12k-Stage-0 Text Generation • 2B • Updated 11 days ago • 22