MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution Paper • 2603.18718 • Published 8 days ago • 1
FinMCP-Bench: Benchmarking LLM Agents for Real-World Financial Tool Use under the Model Context Protocol Paper • 2603.24943 • Published 1 day ago • 2
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation Paper • 2603.25702 • Published 1 day ago • 4
MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models Paper • 2603.25744 • Published 1 day ago • 4
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published 22 days ago • 16
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data Paper • 2603.25319 • Published 1 day ago • 25
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models Paper • 2603.25502 • Published 1 day ago • 36
PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 1 day ago • 95
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 1 day ago • 84
Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning Paper • 2603.23404 • Published 3 days ago • 3
EVA: Efficient Reinforcement Learning for End-to-End Video Agent Paper • 2603.22918 • Published 3 days ago • 38
OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning Paper • 2603.24458 • Published 2 days ago • 4
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published 2 days ago • 37
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 2 days ago • 83
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Paper • 2603.24533 • Published 2 days ago • 35
CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare Paper • 2603.24157 • Published 2 days ago • 8
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning Paper • 2603.21289 • Published 5 days ago • 17
VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs Paper • 2603.23481 • Published 3 days ago • 6