alibidaran / GRPO_Python_Reasoning_Demo
Tags: Transformers, Safetensors, English, text-generation-inference, unsloth, llama, trl
License: apache-2.0
Commit History
Upload model trained with Unsloth
6eae990
verified
alibidaran committed on Jul 17
Upload model trained with Unsloth
f5871ce
verified
alibidaran committed on Jul 11
Upload model trained with Unsloth
f8cb221
verified
alibidaran committed on Jul 11
Upload README.md with huggingface_hub
c13a282
verified
alibidaran committed on Jul 11
initial commit
63e6bef
verified
alibidaran committed on Jul 11