Self-Hinting Language Models Enhance Reinforcement Learning • Paper • 2602.03143 • Published 3 days ago • 19
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing • Paper • 2602.03560 • Published 2 days ago • 33
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models • Paper • 2602.04804 • Published 1 day ago • 40
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning • Paper • 2602.04634 • Published 1 day ago • 69
Training Data Efficiency in Multimodal Process Reward Models • Paper • 2602.04145 • Published 1 day ago • 62
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding • Paper • 2602.01785 • Published 3 days ago • 87
daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently • Paper • 2602.02619 • Published 3 days ago • 45
SWE-Universe: Scale Real-World Verifiable Environments to Millions • Paper • 2602.02361 • Published 3 days ago • 54
Beyond Pixels: Visual Metaphor Transfer via Schema-Driven Agentic Reasoning • Paper • 2602.01335 • Published 4 days ago • 15
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models • Paper • 2601.22060 • Published 7 days ago • 144
ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought • Paper • 2601.23184 • Published 6 days ago • 32
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System • Paper • 2602.02488 • Published 3 days ago • 29
Balancing Understanding and Generation in Discrete Diffusion Models • Paper • 2602.01362 • Published 4 days ago • 13
FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents • Paper • 2602.01566 • Published 4 days ago • 43
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models • Paper • 2602.02185 • Published 3 days ago • 122
DINO-SAE: DINO Spherical Autoencoder for High-Fidelity Image Reconstruction and Generation • Paper • 2601.22904 • Published 6 days ago • 13
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning • Paper • 2602.01058 • Published 5 days ago • 38
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss • Paper • 2602.02493 • Published 3 days ago • 38