arxiv:2602.10693
Xiang Cheng
FFFc2
AI & ML interests
None yet
Recent Activity
upvoted a paper about 9 hours ago
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
upvoted a collection 1 day ago
dots.ocr
authored a paper 8 days ago
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
Organizations
None yet