Xiao Wang

SCZWangxiao

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

Memory in the Age of AI Agents

upvoted a paper 8 days ago

Towards Scalable Pre-training of Visual Tokenizers for Generation

upvoted a paper 8 days ago

Step-GUI Technical Report

Organizations

None yet

upvoted 3 papers 8 days ago

upvoted 2 papers 14 days ago

Rethinking Chain-of-Thought Reasoning for Videos

Paper • 2512.09616 • Published 28 days ago • 17

World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty

Paper • 2512.05927 • Published Dec 5, 2025 • 11

upvoted a paper 17 days ago

CaptionQA: Is Your Caption as Useful as the Image Itself?

Paper • 2511.21025 • Published Nov 26, 2025 • 27

upvoted a paper 18 days ago

Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Paper • 2512.03534 • Published Dec 3, 2025 • 20

upvoted 4 papers about 1 month ago

Video Generation Models Are Good Latent Reward Models

Paper • 2511.21541 • Published Nov 26, 2025 • 45

General Agentic Memory Via Deep Research

Paper • 2511.18423 • Published Nov 23, 2025 • 161

EmoVid: A Multimodal Emotion Video Dataset for Emotion-Centric Video Understanding and Generation

Paper • 2511.11002 • Published Nov 14, 2025 • 3

OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models

Paper • 2511.14582 • Published Nov 18, 2025 • 18

upvoted 2 papers about 2 months ago

VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models

Paper • 2511.02712 • Published Nov 4, 2025 • 4

Towards Universal Video Retrieval: Generalizing Video Embedding via Synthesized Multimodal Pyramid Curriculum

Paper • 2510.27571 • Published Oct 31, 2025 • 17

upvoted a paper 2 months ago

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Paper • 2510.23473 • Published Oct 27, 2025 • 84

commented on a paper 3 months ago

Cache-to-Cache: Direct Semantic Communication Between Large Language Models

Paper • 2510.03215 • Published Oct 3, 2025 • 97

upvoted a paper 3 months ago

UniVideo: Unified Understanding, Generation, and Editing for Videos

Paper • 2510.08377 • Published Oct 9, 2025 • 71

commented on a paper 3 months ago

Self-Improvement in Multimodal Large Language Models: A Survey

Paper • 2510.02665 • Published Oct 3, 2025 • 20

upvoted a paper 3 months ago

LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26, 2025 • 184

upvoted a paper 7 months ago

Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts

Paper • 2506.10357 • Published Jun 12, 2025 • 21

liked a model 8 months ago

vikhyatk/moondream-next

Text Generation • 9B • Updated Nov 24, 2025 • 20 • 48
