Hacker News

It would take forever (or $$$) to train even a 117M model from scratch.
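As a back-of-the-envelope check on that model size, the parameter count of the smallest GPT-2 can be tallied directly from its published hyperparameters (12 layers, d_model = 768, vocab 50257, 1024-token context). The sketch below assumes the standard GPT-2 layer layout (tied input/output embeddings, pre-norm blocks, 4x MLP); note the exact total comes out to about 124M, the often-noted discrepancy with the originally reported 117M figure.

```python
# Parameter count for the smallest GPT-2, from its published architecture.
n_layer, d_model, vocab, n_ctx = 12, 768, 50257, 1024

wte = vocab * d_model   # token embeddings (weights tied with the output head)
wpe = n_ctx * d_model   # learned position embeddings

per_layer = (
    2 * d_model                            # ln_1 (gain + bias)
    + d_model * 3 * d_model + 3 * d_model  # fused Q/K/V projection + bias
    + d_model * d_model + d_model          # attention output projection + bias
    + 2 * d_model                          # ln_2
    + d_model * 4 * d_model + 4 * d_model  # MLP up-projection + bias
    + 4 * d_model * d_model + d_model      # MLP down-projection + bias
)
ln_f = 2 * d_model  # final layernorm

total = wte + wpe + n_layer * per_layer + ln_f
print(total)  # 124439808, i.e. ~124M parameters
```

Either way, the dominant cost is the 12 transformer blocks (~85M parameters) plus the ~39M-parameter embedding table, which is why "just train it yourself" is not cheap.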


I read that as meaning you can start with the actual pre-trained GPT-2 models, but I never got an answer when I specifically asked if that was the case.



