LynnSlate's Collections
Ai

updated Jul 22

  • R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

    Paper • 2503.12937 • Published Mar 17 • 30

  • Reinforcement Pre-Training

    Paper • 2506.08007 • Published Jun 9 • 256

  • Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs

    Paper • 2507.07996 • Published Jul 10 • 32

  • Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

    Paper • 2507.13158 • Published Jul 17 • 24

  • MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

    Paper • 2507.14683 • Published Jul 19 • 126