Because everything past GPT 3.5 has been pretty unremarkable? Doubt anyone in the world would be able to tell a difference in a blind test between 4.0, 4o, 4.5 and 4.1.
I would absolutely take you on a blind test between 4.0 and 4.5 - the improvement is significant.
And while I do want your money, we can just look at LMArena which does blind testing to arrive at an ELO-based score and shows 4.0 to have a score of 1318 while 4.5 has a 1438 - it's over twice likely to be judged better on an arbitrary prompt, and the difference is more significant on coding and reasoning tasks.