In a Training Loop 🔄
lewtun
AI & ML interests
LLMs, LLMs, LLMs
lewtun/Llama-3.1-8B-SFT-LoRA-packing-no-lm-head
lewtun/Llama-3.1-8B-SFT-LoRA-no-packing
lewtun/Llama-3.1-8B-SFT-QLoRA-packing
lewtun/Llama-3.1-8B-SFT-LoRA-packing-no-saved-modules
lewtun/Llama-3.1-8B-SFT-LoRA-packing
lewtun/Llama-3.1-8B-SFT-LoRA-packing-pad-token-eos
lewtun/Llama-3.1-8B-SFT-QLoRA-packing-pad-token-eos
lewtun/Llama-3.1-8B-SFT-full-packing
Text Generation • 8B • Updated • 2
lewtun/Llama-3.1-8B-SFT-LoRA
Text Classification • 0.5B • Updated • 3
lewtun/gemma-2-2b-it-gkd-9b
lewtun/gemma-2-2b-it-gkd-27b
Text Generation • 1.03M • Updated • 1
lewtun/sft_openassistant-guanaco
Text Classification • 0.5B • Updated • 1
lewtun/pythia-6.9b-deduped-tldr-online-dpo
7B • Updated • 1
lewtun/qwen2-1.5B-ultrafeedback-online-dpo
2B • Updated • 2
lewtun/qwen2-0.5B-ultrafeedback-online-dpo
0.6B • Updated • 2
lewtun/pythia-2.8b-deduped-tldr-online-dpo
3B • Updated • 2
lewtun/qwen2-7B-ultrafeedback-online-dpo-bs-1
lewtun/qwen2-7B-ultrafeedback-online-dpo-bs-2
lewtun/qwen2-7B-ultrafeedback-online-dpo
lewtun/pythia-1b-deduped-tldr-online-dpo
1B • Updated • 2
lewtun/pythia-1b-tldr-online-dpo
lewtun/qwen2-0.5B-lr-5e-7