DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis • Paper 2508.20033 • Published Aug 27, 2025
Drowning in Documents: Consequences of Scaling Reranker Inference • Paper 2411.11767 • Published Nov 18, 2024
Text2SQL is Not Enough: Unifying AI and Databases with TAG • Paper 2408.14717 • Published Aug 27, 2024
Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates • Paper 2206.00832 • Published Jun 2, 2022
Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion • Paper 2406.11196 • Published Jun 17, 2024
Hydra: Sequentially-Dependent Draft Heads for Medusa Decoding • Paper 2402.05109 • Published Feb 7, 2024
RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network • Paper 2206.14098 • Published Jun 28, 2022
SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models • Paper 2303.10464 • Published Mar 18, 2023
Sparse Iso-FLOP Transformations for Maximizing Training Efficiency • Paper 2303.11525 • Published Mar 21, 2023
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment • Paper 2405.03594 • Published May 6, 2024
LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms • Paper 2311.13133 • Published Nov 22, 2023
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining • Paper 2312.17482 • Published Dec 29, 2023
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset • Paper 2402.10176 • Published Feb 15, 2024