Hacker News

It would take forever (or $$$) to train even a 117M model from scratch.
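As a back-of-the-envelope check on that model size, the parameter count of the smallest GPT-2 can be tallied directly from its published hyperparameters (12 layers, d_model = 768, vocab 50257, 1024-token context). The sketch below assumes the standard GPT-2 layer layout (tied input/output embeddings, pre-norm blocks, 4x MLP); note the exact total comes out to about 124M, the often-noted discrepancy with the originally reported 117M figure.

```python
# Parameter count for the smallest GPT-2, from its published architecture.
n_layer, d_model, vocab, n_ctx = 12, 768, 50257, 1024

wte = vocab * d_model   # token embeddings (weights tied with the output head)
wpe = n_ctx * d_model   # learned position embeddings

per_layer = (
    2 * d_model                            # ln_1 (gain + bias)
    + d_model * 3 * d_model + 3 * d_model  # fused Q/K/V projection + bias
    + d_model * d_model + d_model          # attention output projection + bias
    + 2 * d_model                          # ln_2
    + d_model * 4 * d_model + 4 * d_model  # MLP up-projection + bias
    + 4 * d_model * d_model + d_model      # MLP down-projection + bias
)
ln_f = 2 * d_model  # final layernorm

total = wte + wpe + n_layer * per_layer + ln_f
print(total)  # 124439808, i.e. ~124M parameters
```

Either way, the dominant cost is the 12 transformer blocks (~85M parameters) plus the ~39M-parameter embedding table, which is why "just train it yourself" is not cheap.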


I read that as meaning you can start with the actual pre-trained GPT-2 models, but I never got an answer when I specifically asked if that was the case.



