EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published Dec 5, 2025 • 38
Simulating the Visual World with Artificial Intelligence: A Roadmap Paper • 2511.08585 • Published Nov 11, 2025 • 29
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation Paper • 2510.26794 • Published Oct 30, 2025 • 26
From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors Paper • 2510.17439 • Published Oct 20, 2025 • 26
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction Paper • 2510.22706 • Published Oct 26, 2025 • 40
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views Paper • 2510.18632 • Published Oct 21, 2025 • 21
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion Paper • 2507.02813 • Published Jul 3, 2025 • 60
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29, 2025 • 68
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets Paper • 2505.07747 • Published May 12, 2025 • 61