
Yes, but even that can still be run (slowly) on CPU-only systems with as little as about 32 GB of RAM. Memory virtualization is a thing. If you get used to using it like email rather than chat, it’s still super useful even if you’re waiting half an hour for a reply. Presumably you have a fast distill on tap for interactive stuff.

I run my models in an agentic framework with fast models that can ask slower models or APIs when needed. It works perfectly, 60 percent of the time lol.
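
For anyone curious what that kind of routing could look like, here's a rough sketch. This is not my actual setup: `Answer`, `route`, `toy_fast`, and `toy_slow` are made-up names, and the "confidence" check is just a placeholder for whatever signal your fast model gives you.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Answer:
    text: str
    confident: bool  # hypothetical flag: did the fast model trust its own reply?


def route(
    prompt: str,
    fast: Callable[[str], Answer],
    slow: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the fast model first; fall back to the slow one if it is unsure.

    Returns (which_model_answered, reply_text).
    """
    first = fast(prompt)
    if first.confident:
        return ("fast", first.text)
    # Escalate: in practice this could be a big local model or a remote API,
    # and the wait could be minutes rather than milliseconds.
    return ("slow", slow(prompt))


def toy_fast(prompt: str) -> Answer:
    # Stand-in for a small distilled model: only "confident" on short prompts.
    return Answer(text=f"fast:{prompt}", confident=len(prompt) < 20)


def toy_slow(prompt: str) -> str:
    # Stand-in for the slow model or API call.
    return f"slow:{prompt}"


if __name__ == "__main__":
    print(route("hi", toy_fast, toy_slow))
    print(route("a much longer and harder question", toy_fast, toy_slow))
```

The hard part is the escalation decision, not the plumbing, which is probably where the other 40 percent goes.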


