LLM360/guru-7B

metadata

library_name: transformers
pipeline_tag: text-generation
license: cc-by-nc-4.0

This repository contains the Guru-7B model (base: Qwen2.5-7B), presented in Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective.

The reported scores were evaluated with temperature=1.0 and top_p=0.7.
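To sample with the same settings, something like the following may work. This is a minimal sketch, not the authors' evaluation script: it assumes the checkpoint loads through the standard `transformers` causal-LM classes (as the `library_name` metadata suggests), that `accelerate` is installed for `device_map="auto"`, and that a plain-text prompt is acceptable. The `generate` helper, the `max_new_tokens` default, and the prompt format are illustrative assumptions.

```python
# Repository id as shown on this page.
MODEL_ID = "LLM360/guru-7B"

# Sampling settings stated in this model card for evaluation.
# do_sample=True is required for temperature/top_p to take effect.
SAMPLING_KWARGS = {"do_sample": True, "temperature": 1.0, "top_p": 0.7}


def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Hypothetical helper: sample one completion for a plain-text prompt."""
    # Imported here so the settings above can be inspected without
    # transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # assumption: accelerate is available
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumed limit, not from the card
        **SAMPLING_KWARGS,
    )

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is 17 * 24?")` downloads the weights on first use. Because sampling is stochastic at temperature=1.0, outputs differ between runs.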

Please refer to the paper for more details.