Yuzhen Huang

yuzhen17

https://hyz17.github.io

HYZ17

Recent Activity

upvoted a paper about 22 hours ago

Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems

Paper • 2602.08847 • Published 3 days ago • 15

authored a paper 2 days ago

LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth

Paper • 2602.07962 • Published 4 days ago • 24

upvoted a paper 2 days ago

LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth

Paper • 2602.07962 • Published 4 days ago • 24

reacted to danielhanchen's post with 🚀 13 days ago

You can now run Kimi K2.5 locally! 🔥

We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit.
Get >40 tok/s on 242GB or 622GB VRAM/RAM for near full precision.

GGUF: unsloth/Kimi-K2.5-GGUF

Guide: https://unsloth.ai/docs/models/kimi-k2.5

7 replies

upvoted a paper 30 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 224

authored a paper about 1 month ago

SWE-RM: Execution-free Feedback For Software Engineering Agents

Paper • 2512.21919 • Published Dec 26, 2025 • 10

upvoted a paper about 1 month ago

SWE-RM: Execution-free Feedback For Software Engineering Agents

Paper • 2512.21919 • Published Dec 26, 2025 • 10

upvoted a paper 2 months ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 296

upvoted 3 papers 3 months ago

authored a paper 3 months ago

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Paper • 2510.25726 • Published Oct 29, 2025 • 46

upvoted a paper 3 months ago

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Paper • 2510.25726 • Published Oct 29, 2025 • 46

upvoted 3 papers 4 months ago

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published Oct 24, 2025 • 101

A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning

Paper • 2510.15444 • Published Oct 17, 2025 • 148

Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards

Paper • 2509.24981 • Published Sep 29, 2025 • 29

updated a model 5 months ago

yuzhen17/llama2-42M-babylm

Updated Sep 20, 2025

published a model 5 months ago

yuzhen17/llama2-42M-babylm

Updated Sep 20, 2025

upvoted 2 papers 5 months ago

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18, 2025 • 111

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190
