Sunny Sanyal
Sunny111
AI & ML interests
Efficient training recipes for large models (mostly LLMs)
Recent Activity
posted an update about 2 hours ago
Are you familiar with reverse residual connections or looping in language models?
Excited to share my Looped-GPT blog post and codebase!
https://github.com/sanyalsunny111/Looped-GPT
TL;DR: looping during pre-training improves generalization (a minimal sketch of the idea follows below).
The plot shows GPT-2 LMs pre-trained on 15.73B OpenWebText (OWT) tokens.
P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
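For readers new to the idea, here is a minimal sketch of what "looping" typically means: a single weight-tied transformer block unrolled several times in the forward pass, so effective depth grows without adding parameters. This is an illustration under my own assumptions, not the recipe from the linked repo; the class name LoopedBlock, the loop count, and the placement of the cross-loop residual (one plausible reading of "reverse residual connections") are all hypothetical.

```python
import torch
import torch.nn as nn

class LoopedBlock(nn.Module):
    """A weight-tied transformer block applied num_loops times per forward pass.
    Hypothetical sketch; names and residual placement are assumptions."""

    def __init__(self, d_model: int, n_heads: int, num_loops: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.num_loops = num_loops

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True entries are blocked, so each token attends only backward.
        T = x.size(1)
        causal = torch.triu(
            torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1
        )
        x0 = x  # keep the loop input around for a cross-loop residual
        for _ in range(self.num_loops):
            h = self.ln1(x)
            attn_out, _ = self.attn(h, h, h, attn_mask=causal, need_weights=False)
            x = x + attn_out
            x = x + self.mlp(self.ln2(x))
            x = x + x0  # re-inject the loop input (assumed "reverse residual")
        return x

# Usage: same parameter count as one block, but num_loops times the compute.
block = LoopedBlock(d_model=768, n_heads=12, num_loops=4)
x = torch.randn(2, 16, 768)  # (batch, seq_len, d_model)
print(block(x).shape)        # torch.Size([2, 16, 768])
```

The trade-off this sketch shows: looping buys extra effective depth and compute at a fixed parameter budget, which is why it is interesting as a pre-training recipe for small models.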
upvoted a paper 29 days ago
Pre-training Small Base LMs with Fewer Tokens
liked a model about 1 month ago
GuminiResearch/Gumini-1.5B-Base