Tong Wu

wutong16

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 months ago

V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties

updated a model 3 months ago

wutong16/upload_1108

published a model 3 months ago

wutong16/upload_1108

Organizations

None yet

upvoted a paper 2 months ago

V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties

Paper • 2512.11799 • Published Dec 12, 2025 • 30

upvoted 2 papers 6 months ago

Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity

Paper • 2508.05609 • Published Aug 7, 2025 • 29

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Paper • 2508.04700 • Published Aug 6, 2025 • 52

upvoted a paper 8 months ago

Video World Models with Long-term Spatial Memory

Paper • 2506.05284 • Published Jun 5, 2025 • 55

upvoted a paper 9 months ago

Visual Agentic Reinforcement Fine-Tuning

Paper • 2505.14246 • Published May 20, 2025 • 32

upvoted a paper 10 months ago

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography

Paper • 2504.07083 • Published Apr 9, 2025 • 22

upvoted 5 papers about 1 year ago

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Paper • 2502.08590 • Published Feb 12, 2025 • 42

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

Paper • 2412.12083 • Published Dec 16, 2024 • 12

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 97

FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models

Paper • 2412.07674 • Published Dec 10, 2024 • 20

Imagine360: Immersive 360 Video Generation from Perspective Anchor

Paper • 2412.03552 • Published Dec 4, 2024 • 29

upvoted a paper over 1 year ago

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

Paper • 2410.16268 • Published Oct 21, 2024 • 69

upvoted 3 papers about 2 years ago

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

Paper • 2401.16420 • Published Jan 29, 2024 • 55

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Paper • 2312.03818 • Published Dec 6, 2023 • 34

HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image

Paper • 2312.04543 • Published Dec 7, 2023 • 22