Great work and great article!
Regarding the maximum model size we can train with this approach: the beginning of the article mentions "models from 0.5B to 70B parameters", but at the end you write that "For large models (7B+), this HF skills job is not suitable". Which order of magnitude is correct?
I suspect the practical maximum is 7B. If that's the case, do you plan to support training larger models?
Thanks!
Julien Jouganous
julienjouganous
Recent Activity
Commented on an article 8 days ago: We Got Claude to Fine-Tune an Open Source LLM