Tiny Aya Collection Bridging Scale and Multilingual Depth • 10 items • Updated about 15 hours ago • 30
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • Who needs 1T parameters? Olympiad proofs with a 4B model • 28
ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning Paper • 2602.02192 • Published 15 days ago • 12
Surprisal Guided Selection Collection Training at test-time for kernel optimization • 2 items • Updated 5 days ago • 1