VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors Paper • 2604.02486 • Published 7 days ago • 5
Unfolding Robotics: Open-Source Shirt Folding from Data to Deployment 🤖 Explore a robot that folds clothes with open-source tools • Running • 34
LightThinker++: From Reasoning Compression to Memory Management Paper • 2604.03679 • Published 5 days ago • 27
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 3 days ago • 78
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale Paper • 2604.04771 • Published 3 days ago • 97
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published 3 days ago • 162
SkillX: Automatically Constructing Skill Knowledge Bases for Agents Paper • 2604.04804 • Published 3 days ago • 23
Test-Time Scaling Makes Overtraining Compute-Optimal Paper • 2604.01411 • Published 8 days ago • 19
FlowSlider: Training-Free Continuous Image Editing via Fidelity-Steering Decomposition Paper • 2604.02088 • Published 7 days ago • 5
unsloth/gemma-4-26B-A4B-it-GGUF Image-Text-to-Text • 25B • Updated about 11 hours ago • 993k • 334
UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving Paper • 2604.02190 • Published 7 days ago • 23
GPA: Learning GUI Process Automation from Demonstrations Paper • 2604.01676 • Published 7 days ago • 12