agurung/Qwen2.5-7B-Instruct-flawedfiction-latent-grpo-nosft Text Generation • 8B • Updated Oct 30 • 5
agurung/v3ff_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset_newprompt Text Generation • 8B • Updated Oct 25 • 21
agurung/v2ff_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset Text Generation • 8B • Updated Oct 25 • 6
agurung/v1ff_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset Text Generation • 8B • Updated Oct 25 • 5
agurung/Qwen2.5-7B-Instruct-CONTRASTIVE-NRL-NCP-GRPO-NLL-UNBOUNDED-IMPLICITPROMPT-RPWITHOUTUPDATE Text Generation • 8B • Updated Sep 28 • 5
agurung/Qwen2.5-7B-Instruct-CONTRASTIVE-NRL-NCP-GRPO-NLL-UNBOUNDED-IMPLICITPROMPT-RPWITHSFT Text Generation • 8B • Updated Sep 25 • 12
agurung/Qwen2.5-7B-Instruct-CONTRASTIVE-NRL-NCP-GRPO-NLL-UNBOUNDED-IMPLICITPROMPT Text Generation • 8B • Updated Aug 26 • 6
agurung/tmp_renamed_v4_savebestearly_sft_qwen7b_25percent_lr_1e4_bptt_offset Text Generation • 8B • Updated Aug 25 • 7
agurung/Qwen2.5-7B-Instruct-CONTRASTIVE-NRL-NCP-GRPO-NLL-UNBOUNDED Text Generation • 8B • Updated Aug 22 • 8
agurung/v4_savebestearly_sft_qwen7B_25percent_lr_1e4_bptt_offset Text Generation • 8B • Updated Aug 19 • 6
agurung/v4_savebestearly_sft_qwen7B_25percent_lr_1e3_bptt_offset Text Generation • 8B • Updated Aug 17 • 7
agurung/Qwen2.5-7B-Instruct-CONTRASTIVE-NRL-NCP-GRPO-PPL-UNBOUNDED Text Generation • 8B • Updated Aug 13 • 12
agurung/v2sft_all_qwen7B_25percent_lr_1e6_allgrad_no_reasoning_projector Text Generation • 8B • Updated Aug 13 • 8