The tasks and counterfactuals from the Mechanistic Interpretability Benchmark.
AI & ML interests
Principled evaluation of mechanistic interpretability methods.
Datasets (7)
mib-bench/ravel • 117k • 17
mib-bench/arithmetic_subtraction • 20.9k • 31
mib-bench/arithmetic_addition • 40.4k • 85
mib-bench/ioi • 21k • 414
mib-bench/arc_easy • 4.01k • 162
mib-bench/arc_challenge • 2k • 58
mib-bench/copycolors_mcqa • 1.89k • 83