
There are multiple variations of the model starting from 1.5B parameters.


Those are distillations of the model.


Have you used those? In my experience, even the 70B distillation is far worse than what you can expect from o1 or the R1 available on the web.


No, I haven't. I've used Perplexity's R1 but I don't know how many parameters it has. It's quite good, although too slow.



