NitroGen: An Open Foundation Model for Generalist Gaming Agents Paper • 2601.02427 • Published 5 days ago • 32
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields Paper • 2601.03252 • Published 3 days ago • 87
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published 21 days ago • 95
Spatia: Video Generation with Updatable Spatial Memory Paper • 2512.15716 • Published 23 days ago • 30
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 17 days ago • 49
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models Paper • 2512.16561 • Published 22 days ago • 19
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling Paper • 2512.14614 • Published 24 days ago • 67
Exploring MLLM-Diffusion Information Transfer with MetaCanvas Paper • 2512.11464 • Published 28 days ago • 12
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties Paper • 2512.11799 • Published 28 days ago • 29
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 116
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing Paper • 2512.06065 • Published Dec 5, 2025 • 28
LATTICE: Democratize High-Fidelity 3D Generation at Scale Paper • 2512.03052 • Published Nov 24, 2025 • 10
SIMA 2: A Generalist Embodied Agent for Virtual Worlds Paper • 2512.04797 • Published Dec 4, 2025 • 24
NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation Paper • 2512.05106 • Published Dec 4, 2025 • 15
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 167