shenqi's picture

5 4

shenqi

shenqiorient

AI & ML interests

None yet

Organizations

None yet

upvoted 3 papers 3 months ago

The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

Paper • 2507.11097 • Published Jul 15, 2025 • 64

LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

Paper • 2510.08211 • Published Oct 9, 2025 • 22

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents

Paper • 2509.26354 • Published Sep 30, 2025 • 17

upvoted a paper 7 months ago

Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning

Paper • 2506.02867 • Published Jun 3, 2025 • 2

upvoted a paper 12 months ago

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Paper • 2501.12895 • Published Jan 22, 2025 • 61