VibeSearchBench: Benchmarking Long-horizon Proactive Search in the Wild Paper • 2605.27882 • Published 10 days ago • 15
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 25 days ago • 195