Because everything past GPT 3.5 has been pretty unremarkable? Doubt anyone in th...

falcor84 · 2025-08-07T14:27:21 1754576841

I would absolutely take you on a blind test between 4.0 and 4.5 - the improvement is significant.

And while I do want your money, we can just look at LMArena, which uses blind testing to arrive at an Elo-based score and shows 4.0 with a score of 1318 while 4.5 has 1438 - that makes it roughly twice as likely to be judged the better one than the worse one on an arbitrary prompt, and the difference is more significant on coding and reasoning tasks.
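For anyone who wants to check the arithmetic, here is a minimal sketch of the standard Elo expected-score formula applied to those two ratings. It assumes LMArena's scores sit on the usual Elo scale (base 10, 400-point divisor) and ignores ties; the function name is just illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic model.

    Assumption: base-10 logistic curve with a 400-point scale, no ties.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the comment above: 4.5 at 1438, 4.0 at 1318.
p = elo_expected_score(1438, 1318)
odds = p / (1 - p)  # how many times more likely a win is than a loss
print(f"win probability: {p:.3f}, odds: {odds:.2f} to 1")
```

Under those assumptions a 120-point gap works out to about a two-in-three chance of the higher-rated model being preferred, so about 2 to 1 odds.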

apwell23 · 2025-08-07T14:09:57 1754575797

> Doubt anyone in the world would be able to tell a difference in a blind test between 4.0, 4o, 4.5 and 4.1.

But this isn't 4.6. It's 5.

I can tell the difference between 3 and 4.

dwater · 2025-08-07T14:31:32 1754577092

That's a very Spinal Tap argument for why it will be more than just an incremental improvement.