- Captain Safari: A World Engine
  Paper • 2511.22815 • Published • 12
- Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
  Paper • 2512.08478 • Published • 77
- WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling
  Paper • 2512.14614 • Published • 73
- NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
  Paper • 2601.00393 • Published • 133
Collections
Collections including paper arxiv:2601.00393
- NitroGen: An Open Foundation Model for Generalist Gaming Agents
  Paper • 2601.02427 • Published • 46
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 325
- DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models
  Paper • 2512.24165 • Published • 52
- Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
  Paper • 2601.02151 • Published • 113

- Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
  Paper • 2512.21004 • Published • 13
- Spatia: Video Generation with Updatable Spatial Memory
  Paper • 2512.15716 • Published • 34
- NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
  Paper • 2601.00393 • Published • 133
- Geometry-Aware Rotary Position Embedding for Consistent Video World Model
  Paper • 2602.07854 • Published • 10

- The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
  Paper • 2509.26507 • Published • 550
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 325
- NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
  Paper • 2601.00393 • Published • 133
- LTX-2: Efficient Joint Audio-Visual Foundation Model
  Paper • 2601.03233 • Published • 178

- PersonaLive! Expressive Portrait Image Animation for Live Streaming
  Paper • 2512.11253 • Published • 41
- NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
  Paper • 2601.00393 • Published • 133
- Agent READMEs: An Empirical Study of Context Files for Agentic Coding
  Paper • 2511.12884 • Published • 28
- SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
  Paper • 2503.11576 • Published • 157

- MagicWorld: Interactive Geometry-driven Video World Exploration
  Paper • 2511.18886 • Published • 19
- EvoVLA: Self-Evolving Vision-Language-Action Model
  Paper • 2511.16166 • Published • 6
- MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
  Paper • 2511.17889 • Published • 5
- NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos
  Paper • 2601.00393 • Published • 133