1 14

AnIdealRing

SmartDazi

AI & ML interests

None yet

Recent Activity

upvoted a paper about 17 hours ago

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

upvoted a paper 3 days ago

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

upvoted a paper 11 days ago

RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents

Organizations

models 0

None public yet

datasets 0

None public yet