Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels drbh, danieldk • Aug 18, 2025 • 100
Unifying Demonstration Selection and Compression for In-Context Learning Paper • 2405.17062 • Published May 27, 2024 • 1
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling Paper • 2504.07053 • Published Apr 9, 2025 • 6
Article 🐯 Liger GRPO meets TRL +4 shisahni, kashif, smohammadi, ShirinYamani, m0m0chen, liberty4321 • May 25, 2025 • 53
Reply: Does Liger Kernel affect training speed at all? Is it faster, slower, or no different compared to regular GRPO?
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14, 2025 • 99
Tiny Series Collection Tiny datasets that empower the foundation of Small Language Model! • 14 items • Updated 11 days ago • 44