
3.7 did score higher in coding benchmarks but in practice 3.5 is much better at coding. 3.7 ignores instructions and does things you didn't ask it to do.


I suspect that is precisely why it got better at coding benchmarks.


3.7 is too overactive

I prefer Gemini 2.5 pro for all code now


Gemini 2.5 Pro has solved problems that Claude 3.7 cannot, so I use it for the hard stuff.

But Gemini is at least as overactive as Claude, sometimes even more overactive when it comes to something like comment spam.

Of course, this can be fixed with prompting. And sometimes I feel sheepish complaining about the machine god doing most of my chore work that didn't even exist a couple years ago.


2.5 is my “okay Claude can’t get it” but first I check my “bank account” to see if I can afford it.


Isn’t 2.5 pro significantly cheaper?


They're the same price, and Gemini has a large free tier.


Not when you’re doing 500k tokens per query.


I think it just does that to eat up your token quota and get you to upgrade.

Like, ask it a simple question and it comes up with a full repo, complete with a README and a Makefile, when all you wanted to know was how efficient a particular algorithm would be in the included code.

Can't wait until they add research to the Pro plan because, you know, I have questions...


> I think it just does that to eat up your token quota and get you to upgrade.

If you pay for a subscription then they don’t have an incentive to use more tokens for the same answer.

It’s definitely because feedback from people has “taught” it that more boilerplate is better. It’s the same reason ChatGPT is annoyingly complimentary.


That has been the most annoying thing about it, so I'm glad I'm not paying for it anymore.


Can’t you still use Sonnet 3.5 anyway? Or is that a feature for paying subscribers only?



