
Seems like you don’t have to train from scratch. You can just distil a new model off an existing one by just buying API credits to copy the model.


"Just" is doing a lot of heavy lifting there. It definitely helps with getting data but actually training your model would be very capital intensive, ignoring the cost of paying for those outputs you're training on.


Your "API credits" don't buy the model. You just buy some resource to use the model that is running somewhere else.


What the parent poster means is that you can use the API to generate many question/answer pairs on which you then train your own model. For a more detailed explanation of this and other related methods, I can recommend this paper: https://arxiv.org/pdf/2402.13116
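To make the data-collection half of that concrete, here is a rough sketch, not anything from the paper. `fake_teacher` is a hypothetical stand-in for whatever paid API call you would actually make, and the `prompt`/`completion` field names are just an assumed format for the training file. The expensive part (actually training the student on the result) is not shown.

```python
import json
from typing import Callable, Dict, Iterable, List


def build_distillation_set(
    prompts: Iterable[str], teacher: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Query the teacher once per prompt and keep the question/answer pairs."""
    pairs = []
    for prompt in prompts:
        answer = teacher(prompt)
        if answer.strip():  # drop empty or whitespace-only responses
            pairs.append({"prompt": prompt, "completion": answer})
    return pairs


def to_jsonl(pairs: List[Dict[str, str]]) -> str:
    """Serialize the pairs as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(pair) for pair in pairs)


def fake_teacher(prompt: str) -> str:
    # Hypothetical placeholder: a real setup would call the hosted model here
    # and pay per request.
    return f"Answer to: {prompt}"


if __name__ == "__main__":
    dataset = build_distillation_set(["What is 2+2?", "Name a prime."], fake_teacher)
    print(to_jsonl(dataset))
```

The resulting file is what you would feed into an ordinary supervised fine-tuning run for your own model.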


You don't understand what Gigachad is talking about. You can buy API credits to gain access to a model in the cloud, and then use that to train your own local model through a process called distillation.



