MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 6 days ago • 123
SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering? Paper • 2603.15401 • Published 7 days ago • 17
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 11 days ago • 63
Flash-KMeans: Fast and Memory-Efficient Exact K-Means Paper • 2603.09229 • Published 14 days ago • 79
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 47
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published Jan 26 • 35
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published Dec 31, 2025 • 108
$OneMillion-Bench: How Far are Language Agents from Human Experts? Paper • 2603.07980 • Published 15 days ago • 26
SkillOrchestra: Learning to Route Agents via Skill Transfer Paper • 2602.19672 • Published 29 days ago • 56
Multi-agent cooperation through in-context co-player inference Paper • 2602.16301 • Published Feb 18 • 24
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks Paper • 2602.12670 • Published Feb 13 • 56
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration Paper • 2602.03786 • Published Feb 3 • 89
Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook Paper • 2602.14299 • Published Feb 15 • 27
FeatureBench: Benchmarking Agentic Coding for Complex Feature Development Paper • 2602.10975 • Published Feb 11 • 19