Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 10 days ago • 117
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published 9 days ago • 54
AutoLLMResearch: Training Research Agents for Automating LLM Experiment Configuration -- Learning from Cheap, Optimizing Expensive Paper • 2605.11518 • Published 10 days ago • 4
MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning Paper • 2605.13037 • Published 9 days ago • 8
Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation Paper • 2605.01284 • Published 20 days ago • 3
Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding Paper • 2604.26779 • Published 23 days ago • 13
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 30 days ago • 240
Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment Paper • 2604.05684 • Published Apr 7 • 9
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 629
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 342