Deventhedude's Collections: Finetune data
Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
Paper
• 2505.10597
• Published
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
Paper
• 2504.05535
• Published • 44
nvidia/Nemotron-RL-instruction_following
Preview
• Updated • 190
• 15
nvidia/Nemotron-RL-knowledge-web_search-mcqa
Viewer
• Updated • 2.93k • 117
• 15
nvidia/Nemotron-RL-agent-workplace_assistant
Viewer
• Updated • 1.8k • 1.03k
• 26
nvidia/Nemotron-RL-instruction_following-structured_outputs
Viewer
• Updated • 9.95k • 265
• 37
nvidia/Nemotron-RL-knowledge-mcqa
Viewer
• Updated • 686k • 934
• 12
nvidia/Nemotron-RL-math-OpenMathReasoning
Viewer
• Updated • 113k • 333
• 17
nvidia/Nemotron-RL-knowledge-openqa
Viewer
• Updated • 136k • 281
• 10
nvidia/Nemotron-RL-math-advanced_calculations
Viewer
• Updated • 6k • 118
• 11
nvidia/Nemotron-AIQ-Agentic-Safety-Dataset-1.0
Viewer
• Updated • 10.8k • 1.13k
• 16
nvidia/Nemotron-VLM-Dataset-v2
Viewer
• Updated • 4.58M • 17.8k
• 91
google/code_x_glue_cc_code_completion_token
Viewer
• Updated • 178k • 503
• 9
google/code_x_glue_cc_cloze_testing_all
Viewer
• Updated • 176k • 258
• 6
google/code_x_glue_cc_clone_detection_big_clone_bench
Viewer
• Updated • 1.73M • 662
• 22
google/code_x_glue_ct_code_to_text
Viewer
• Updated • 1.01M • 4.36k
• 80
google/code_x_glue_tc_nl_code_search_adv
Viewer
• Updated • 281k • 450
• 11
TeichAI/claude-sonnet-4.5-high-reasoning-250x
Viewer
• Updated • 247 • 130
• 37
Idea2Plan: Exploring AI-Powered Research Planning
Paper
• 2510.24891
• Published
TGPR: Tree-Guided Policy Refinement for Robust Self-Debugging of LLMs
Paper
• 2510.06878
• Published • 1
FML-bench: A Benchmark for Automatic ML Research Agents Highlighting the Importance of Exploration Breadth
Paper
• 2510.10472
• Published • 9
Scientific Algorithm Discovery by Augmenting AlphaEvolve with Deep Research
Paper
• 2510.06056
• Published • 6
RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback
Paper
• 2510.06186
• Published
AlphaResearch: Accelerating New Algorithm Discovery with Language Models
Paper
• 2511.08522
• Published • 18
open-thoughts/OpenThoughts3-1.2M
Viewer
• Updated • 1.2M • 25.3k
• 234
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents
Paper
• 2310.19923
• Published • 15
rl-research/dr-tulu-sft-data
Viewer
• Updated • 13.1k • 208
• 29
miromind-ai/MiroVerse-v0.1
Viewer
• Updated • 228k • 7.45k
• 236
nvidia/Llama-Nemotron-Post-Training-Dataset
Viewer
• Updated • 3.91M • 4.4k
• 674
nick007x/github-code-2025
Viewer
• Updated • 148M • 890
• 117
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper
• 2508.06471
• Published • 212
natolambert/GeneralThought-430K-filtered
Viewer
• Updated • 338k • 2.47k
• 35
RJT1990/GeneralThoughtArchive
Viewer
• Updated • 431k • 1.01k
• 73
open-thoughts/OpenThoughts-114k
Viewer
• Updated • 228k • 104k
• 859
PrimeIntellect/SYNTHETIC-1
Viewer
• Updated • 1.99M • 3.94k
• 62
PrimeIntellect/synthetic-code-understanding
Viewer
• Updated • 60.6k • 34
• 20
PrimeIntellect/INTELLECT-3-SFT
Viewer
• Updated • 6.98M • 754
• 4
openbmb/InfLLM-V2-data-5B
Viewer
• Updated • 7.19M • 322
• 33
kenhktsui/open-react-retrieval-multi-neg-result-new-kw
Viewer
• Updated • 25.2k • 64
• 3
alwaysfurther/tiny-agent-with-tools
Viewer
• Updated • 27 • 73
TuringEnterprises/Turing-Open-Reasoning
Viewer
• Updated • 50 • 173
• 192
TeichAI/claude-4.5-opus-high-reasoning-250x
Viewer
• Updated • 250 • 1.1k
• 391
PrimeIntellect/INTELLECT-3-RL
Viewer
• Updated • 70.7k • 3k
• 7
PrimeIntellect/Reverse-Text-RL
Viewer
• Updated • 1k • 9.1k
• 2
PrimeIntellect/Reverse-Text-SFT
Viewer
• Updated • 1k • 1.37k
• 3
PrimeIntellect/SYNTHETIC-2-Base-Code
Viewer
• Updated • 57.3k • 110
PrimeIntellect/SYNTHETIC-2-Base-Math
Viewer
• Updated • 105k • 20
• 1
PrimeIntellect/SYNTHETIC-2-Base
Viewer
• Updated • 465k • 82
• 9
PrimeIntellect/SYNTHETIC-2-Base-General-Reasoning
Viewer
• Updated • 165k • 12
• 1
PrimeIntellect/SYNTHETIC-2-SFT-verified
Viewer
• Updated • 105k • 377
• 11
PrimeIntellect/SYNTHETIC-2-Base-Answer-Critique
Viewer
• Updated • 50k • 10
• 2
PrimeIntellect/SYNTHETIC-2-Base-Instruction-Following
Viewer
• Updated • 87.5k • 15
PrimeIntellect/SYNTHETIC-2
Viewer
• Updated • 51.6k • 265
• 15
PrimeIntellect/LiveCodeBench-v5
Viewer
• Updated • 279 • 194
arcee-ai/bfcl_v4_web_search
Viewer
• Updated • 100 • 175
• 6
arcee-ai/general-dpo-datasets
Viewer
• Updated • 91.6k • 91
arcee-ai/synthetic-data-gen
Viewer
• Updated • 999k • 342
• 2
arcee-ai/reasoning-sharegpt
Viewer
• Updated • 29.9k • 89
• 23
arcee-ai/infini-instruct-top-500k
Viewer
• Updated • 500k • 37
• 6
arcee-ai/cleaned-mlabonne-distilabel-truthy-dpo-v0.1-filtered
Viewer
• Updated • 663 • 10
glaiveai/glaive-function-calling-v2
Viewer
• Updated • 113k • 63.3k
• 510
Salesforce/xlam-function-calling-60k
Viewer
• Updated • 60k • 35.5k
• 633
HuggingFaceFW/fineweb-edu
Viewer
• Updated • 3.5B • 513k
• 1.13k
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch
Paper
• 2512.02395
• Published • 52
MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning
Paper
• 2510.08567
• Published
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
Paper
• 2511.19773
• Published • 10
ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use
Paper
• 2510.27363
• Published • 23
Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries
Paper
• 2511.00710
• Published • 5
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models
Paper
• 2510.01623
• Published • 13
DeepEyesV2: Toward Agentic Multimodal Model
Paper
• 2511.05271
• Published • 47
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search
Paper
• 2510.12801
• Published • 14
DeepAgent: A General Reasoning Agent with Scalable Toolsets
Paper
• 2510.21618
• Published • 103
Open Multimodal Retrieval-Augmented Factual Image Generation
Paper
• 2510.22521
• Published • 31
smolagents/android-control
Viewer
• Updated • 15.3k • 2.9k
• 14
smolagents/guiact-web-single
Viewer
• Updated • 13.3k • 234
• 1
smolagents/hermes-function-calling-v1-formatted-code-agent
Viewer
• Updated • 9k • 162
• 3
smolagents/aguvis-stage-1
Viewer
• Updated • 459k • 1.74k
• 17
smolagents/aguvis-stage-2
Viewer
• Updated • 784k • 4k
• 29
beyoru/ToolCall_synthetic_qwen3
Viewer
• Updated • 60k • 13
• 10
rogue-security/mcp-tool-use-quality-benchmark
Viewer
• Updated • 5k • 27
• 3
mlx-community/hermes-reasoning-tool-use
Viewer
• Updated • 51k • 46
• 5
TeichAI/gemini-3-pro-preview-high-reasoning-1000x
Viewer
• Updated • 1.02k • 98
• 78
allenai/Dolci-Instruct-SFT-Tool-Use
Viewer
• Updated • 228k • 240
• 16
nvidia/Nemotron-Content-Safety-Reasoning-Dataset
Preview
• Updated • 214
• 11
ai-safety-institute/AgentHarm
Viewer
• Updated • 468 • 4.45k
• 56
rootsautomation/ScreenSpot
Viewer
• Updated • 1.27k • 3.48k
• 48