VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks Paper • 2511.04662 • Published Nov 6, 2025 • 34
A Survey of Data Agents: Emerging Paradigm or Overstated Hype? Paper • 2510.23587 • Published Oct 27, 2025 • 65
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22, 2025 • 114
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation Paper • 2504.11109 • Published Apr 15, 2025 • 2
ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory Paper • 2509.25140 • Published Sep 29, 2025 • 13
Cache-to-Cache: Direct Semantic Communication Between Large Language Models Paper • 2510.03215 • Published Oct 3, 2025 • 97
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 501
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29, 2025 • 141
Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant Paper • 2508.20907 • Published Aug 28, 2025 • 1
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework Paper • 2405.11143 • Published May 20, 2024 • 40
QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL Paper • 2510.00967 • Published Oct 1, 2025 • 11
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 76
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs Paper • 2507.09477 • Published Jul 13, 2025 • 86
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 260