Do reasoning LLMs actually reason, or do they learn to game the test? Isomorphic Perturbation Testing (IPT) detects reward hacking in inductive programming tasks (SLR-Bench).
-
Isomorphic Perturbation Testing
Evaluate rule hypotheses for genuine reasoning vs. shortcuts
-
AIML-TUDA/SLR-Bench
-
SLR-Bench Leaderboard - Reward Hacking in Reasoning Models
Reward shortcut behavior in LLMs via IPT
-
LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking
Paper • 2604.15149