Hacker News
elorant
on Jan 27, 2025
| on:
Nvidia’s $589B DeepSeek rout
There are multiple variations of the model starting from 1.5B parameters.
bufferoverflow
on Jan 27, 2025
Those are distillations of the model.
rsanek
on Jan 27, 2025
Have you used those? In my experience, even the 70B distillation is far worse than what you can expect from o1 or the R1 available on the web.
elorant
on Jan 27, 2025
No, I haven't. I've used Perplexity's R1 but I don't know how many parameters it has. It's quite good, although too slow.