OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 9 days ago • 305
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning Paper • 2601.06002 • Published Jan 9 • 53
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale Paper • 2601.08225 • Published Jan 13 • 52
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published 25 days ago • 54
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published Jan 8 • 55
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published Dec 31, 2025 • 64
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 24 days ago • 72
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95
Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model Paper • 2512.13507 • Published Dec 15, 2025 • 39
LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 86
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published Dec 17, 2025 • 44
Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed Paper • 2512.14067 • Published Dec 16, 2025 • 15
Rethinking Expert Trajectory Utilization in LLM Post-training Paper • 2512.11470 • Published Dec 12, 2025 • 9