Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators Paper • 2403.16950 • Published Mar 25, 2024 • 4
opensearch-project/opensearch-neural-sparse-encoding-v1 Feature Extraction • 0.1B • Updated Jun 30, 2025 • 3.33k • 13
intfloat/multilingual-e5-large-instruct Feature Extraction • 0.6B • Updated Jul 10, 2025 • 926k • 606
Open LLM Leaderboard 🏆 • 13.8k • Track, rank and evaluate open LLMs and chatbots
Low-bit Quantized Open LLM Leaderboard 🏆 • 194 • Track, rank and evaluate open LLMs and chatbots
Berkeley Function Calling Leaderboard 🏃 • 122 • Compare AI model performance on function calling tasks
corrius/cross-encoder-mmarco-mMiniLMv2-L12-H384-v1 Text Classification • Updated Sep 28, 2023 • 532 • 2