ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 4 days ago • 241
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published 5 days ago • 152
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning Paper • 2505.22019 • Published May 28, 2025 • 12
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders Paper • 2604.07340 • Published 5 days ago • 14
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs • 5 days ago • 40
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published 7 days ago • 200
Marco-MoE Collection A suite of multilingual MoE models with highly sparse architectures • 5 items • Updated 5 days ago • 13
Article Welcome Gemma 4: Frontier multimodal intelligence on device • 11 days ago • 822
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 18 days ago • 126