Alex Zelentsov
azelentsov
AI & ML interests
Language models
Recent Activity
upvoted a paper about 2 months ago
Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs