SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution Paper • 2605.18401 • Published 9 days ago • 125
SEIF: Self-Evolving Reinforcement Learning for Instruction Following Paper • 2605.07465 • Published 19 days ago • 29
CIRCL/vulnerability-severity-classification-roberta-base Text Classification • 0.1B • Updated 7 days ago • 1.5k • 11
Leveraging Verifier-Based Reinforcement Learning in Image Editing Paper • 2604.27505 • Published 27 days ago • 57
Yujie-AI/Llama3_8B_LLaVA-aim_v5-coeff1.0-samples500-merge_ratio0.8 Image-Text-to-Text • 8B • Updated 26 days ago • 59 • 1
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published Apr 15 • 62
QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization Paper • 2604.05963 • Published Apr 7 • 8
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 630