Yingfa Chen

chen-yingfa

https://chen-yingfa.github.io

AI & ML interests

Long-context modeling, continual learning, architectures

Recent Activity

authored a paper 11 days ago

MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling

authored a paper 11 days ago

Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection

authored a paper 11 days ago

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

Organizations

None yet

authored 4 papers 11 days ago

MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling

Paper • 2602.11761 • Published Feb 12 • 8

Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection

Paper • 2604.02819 • Published Apr 3 • 1

DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices

Paper • 2605.10933 • Published May 11 • 4

Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

Paper • 2606.11052 • Published 13 days ago • 16

upvoted a paper 11 days ago

Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

Paper • 2606.11052 • Published 13 days ago • 16

liked a model about 2 months ago

chen-yingfa/HypeNet-5B

5B • Updated Apr 28 • 5 • 1

updated a model about 2 months ago

chen-yingfa/HypeNet-5B

5B • Updated Apr 28 • 5 • 1

updated a collection about 2 months ago

HypeNet

Collection

The models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts • 2 items • Updated Apr 28

published a model about 2 months ago

chen-yingfa/HypeNet-5B

5B • Updated Apr 28 • 5 • 1

updated a collection 3 months ago

HypeNet

Collection

The models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts • 2 items • Updated Apr 28

liked a model 3 months ago

chen-yingfa/HypeNet-2B

2B • Updated Apr 7 • 86 • 2

updated a model 3 months ago

chen-yingfa/HypeNet-2B

2B • Updated Apr 7 • 86 • 2

published a model 3 months ago

chen-yingfa/HypeNet-2B

2B • Updated Apr 7 • 86 • 2

upvoted an article 3 months ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift

Apr 2 • 909

liked a model 4 months ago

openbmb/MiniCPM-SALA

Text Generation • 9B • Updated May 7 • 5.84k • 682

liked a dataset 4 months ago

openbmb/UltraData-Math

Viewer • Updated Apr 15 • 181M • 35.6k • 319

liked a model 5 months ago

openbmb/MiniCPM-o-4_5

Any-to-Any • 9B • Updated May 19 • 306k • 1.4k

authored a paper 5 months ago

Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Paper • 2601.22156 • Published Jan 29 • 15

upvoted a paper 5 months ago

Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Paper • 2601.22156 • Published Jan 29 • 15

submitted a paper to Daily Papers 5 months ago

Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Paper • 2601.22156 • Published Jan 29 • 15
