Hacker News

Claude Opus 4.6 is the best possible model to use in this test, with the least sycophancy. OpenAI and Gemini models are bad in comparison.


ChatGPT thinking models are very good; the instant model is bad. Gemini is always desperate to find an answer, and will give you one no matter what.


Nope, I use GitHub Copilot (agentic mode) and I end up having to use the (more expensive) Claude model because ChatGPT never second-guesses me or even itself. Gemini is slightly worse though.


For a less biased source, check out BSBench (where Claude dominates, and the highest-rated GPT model is 2x worse): https://petergpt.github.io/bullshit-benchmark/viewer/index.v...


I have access to my boss's ChatGPT account, and it is unusable sycophantic slop, horrible to read because all the information is buried under endless emojis and the like. It is almost impossible to tell whether the LLM is right or wrong: every answer looks the same, often with a "my final answer" at the end. It's a mess.

I'm using Claude Opus 4.6, and it is much calmer and more "professional" in tone, with much more information and almost no fluff.


Thank you for saying this. ChatGPT is SO BAD. I suspect anyone who says OpenAI models are good is either lying or botting.



