Pre-Trained Policy Discriminators are General Reward Models Paper • 2507.05197 • Published Jul 7 • 39
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments Paper • 2406.04151 • Published Jun 6, 2024 • 23
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Paper • 2307.16789 • Published Jul 31, 2023 • 101