Training Data Efficiency in Multimodal Process Reward Models Paper • 2602.04145 • Published 21 days ago • 76
TTCS: Test-Time Curriculum Synthesis for Self-Evolving Paper • 2601.22628 • Published 25 days ago • 35
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing Paper • 2602.03845 • Published 21 days ago • 26
Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published Dec 2, 2025 • 55
VisPlay: Self-Evolving Vision-Language Models from Images Paper • 2511.15661 • Published Nov 19, 2025 • 43
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9, 2025 • 104
Self-Rewarding Vision-Language Model via Reasoning Decomposition Paper • 2508.19652 • Published Aug 27, 2025 • 84
POSS: Position Specialist Generates Better Draft for Speculative Decoding Paper • 2506.03566 • Published Jun 4, 2025 • 6
CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation Paper • 2504.00043 • Published Mar 30, 2025 • 10
On Grounded Planning for Embodied Tasks with Language Models Paper • 2209.00465 • Published Aug 29, 2022 • 1