
I've got a comment somewhere on HN that says exactly that: "try CPU inference first, it's pretty good".

The need to reach for a T4 comes when someone is running a big model on images or video and wants sub-second response time. (Think some of the stuff on Snapchat, etc.)
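One rough way to act on "try CPU first" is to time the model on CPU and compare against the latency budget before paying for a GPU. This is only a sketch: `fake_cpu_model`, `measure_latency`, and `meets_budget` are hypothetical names, the stand-in workload is not a real model, and the 1-second budget and 95th-percentile cutoff are my assumptions about what "sub-second" would mean in practice.

```python
import statistics
import time
from typing import Callable, List


def measure_latency(predict: Callable[[], object], warmup: int = 3, runs: int = 20) -> List[float]:
    """Return per-call wall-clock latencies (seconds) for `predict`."""
    for _ in range(warmup):
        predict()  # untimed calls so lazy initialisation does not skew results
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        latencies.append(time.perf_counter() - start)
    return latencies


def meets_budget(latencies: List[float], budget_s: float = 1.0, quantile: float = 0.95) -> bool:
    """True if the chosen latency quantile is under the budget (assumed: 1 s, p95)."""
    ordered = sorted(latencies)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index] < budget_s


def fake_cpu_model() -> int:
    # Hypothetical stand-in for a CPU forward pass; swap in your real model call.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    lat = measure_latency(fake_cpu_model)
    print(f"median latency: {statistics.median(lat) * 1000:.2f} ms")
    print("CPU looks sufficient" if meets_budget(lat) else "consider a GPU such as a T4")
```

If the tail latency on CPU is already under budget with realistic inputs, there is probably no need for the T4; if a large image or video model blows past it, that is the case described above.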


