Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward • Paper • 2511.20561 • Published Nov 25, 2025
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy • Paper • 2510.13778 • Published Oct 15, 2025
Manipulation • Collection • Manipulation-related datasets and models • 15 items • Updated Sep 29, 2025
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning • Paper • 2507.16814 • Published Jul 22, 2025
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling • Paper • 2507.05240 • Published Jul 7, 2025