Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published 13 days ago • 55
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19, 2025 • 231
The Ultra-Scale Playbook 🌌 • 3.69k • The ultimate guide to training LLMs on large GPU clusters
Benchmarking Optimizers for Large Language Model Pretraining Paper • 2509.01440 • Published Sep 1, 2025 • 25