Hacker News

It significantly outperformed competitors on those benchmarks, by about as much as the deltas between some of the others, which are considered significant.


The deltas between the others are mostly not significant either. They're all about equally good. There's no categorical difference between GPT-4 and Claude 3.5.


That's not true.


Okay, what's the categorical difference? Which meaningful category includes one but not the other?



