
I don't mean at the same time.

For a simple question on an RX 6800, I'm seeing ~50 tok/s on 8B models, and Deepseek 16B gives ~40 tok/s. 32B doesn't fit in memory.
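Rough arithmetic for why 32B won't fit, as a sketch rather than a measurement. The RX 6800 has 16 GB of VRAM; everything else here is my assumption: about 4.5 bits per weight (typical of a 4-bit quantization once scales are included) and about 1.5 GiB set aside for KV cache and runtime buffers. `weight_gib` and `fits` are just illustrative helper names.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float = 16.0, overhead_gib: float = 1.5) -> bool:
    """True if weights plus an assumed overhead fit in the given VRAM.

    vram_gib=16.0 approximates the RX 6800; overhead_gib is a guess
    for KV cache and runtime buffers, not a measured value.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    for size in (8, 16, 32):
        gib = weight_gib(size, 4.5)  # 4.5 bits/weight is an assumption
        print(f"{size}B: ~{gib:.1f} GiB weights, fits: {fits(size, 4.5)}")
```

Under those assumptions the 8B and 16B weights leave headroom, while the 32B weights alone already exceed 16 GiB. A heavier quantization or partial CPU offload would change the picture.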


