
It's not just image generation that RLHF worsens. Calibration (confidence in solving a question relative to the actual ability to solve it) went from excellent to nonexistent, and you can see from the report that the base model performed better on a number of tests. Basically a dumber model.
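To make "calibration" concrete: a common way to measure it is expected calibration error (ECE), which bins predictions by stated confidence and compares each bin's average confidence to its actual accuracy. A minimal sketch with made-up numbers (not figures from the GPT-4 report):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average gap between stated confidence and accuracy,
    computed over fixed-width confidence bins."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # assign each prediction to a bin by its confidence (half-open bins,
        # with the lowest bin closed on the left)
        idx = [i for i, c in enumerate(confidences)
               if lo < c <= hi or (b == 0 and c == lo)]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += (len(idx) / n) * abs(avg_conf - accuracy)
    return ece

# Well calibrated: says 80% confident, is right 8 times out of 10 -> gap ~ 0
good = expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2)
# Overconfident: says 90% confident, right only half the time -> ECE ~ 0.4
bad = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

The claim in the thread is that the RLHF'd model's stated confidence stopped tracking its accuracy, i.e. its ECE got much worse, independent of raw task performance.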


Not dumber. More biased.

Important distinction, especially if we're looking to push back out toward the Pareto frontier of the problem.

RLHF is still very much in its infancy and does not come close to optimizing the bias-variance tradeoff, in my personal experience.


My understanding is that OpenAI did indeed find diminished capability across a range of tasks after doing RLHF. You're right to question this, though, as I believe the opposite was true of GPT-3, where RLHF improved performance on certain tasks.

The benefits from a business perspective were still clear, however, and of course the instruction-tuned GPT-4 model still outperformed GPT-3 in general.

There are probably some weird edge cases and nuances that I'm missing - and I'd be happy to be corrected.


No, dumber. Sure, more biased too if you want, but also dumber. OpenAI has indicated as much.


Also generally less creative and insightful.

"No, I won't do it" becomes a good option no matter what if you turn the safety up too high.


Are you saying this specifically about the public GPT-4 API endpoint compared to the idealized GPT-4 described in the paper?


Yes, the public API (or paid ChatGPT) vs. the base model from the paper.



