> since prompts are constantly refined, how do you know it's still in use?
That's an unusual amount of leniency. If someone was tuning their system in such a way, why would you give them the benefit of the doubt that maybe now they've resolved all of the issues and will never do it again? It's like buying a tabloid every week because "what if they changed their ways and now it will all be truthful?"
The analysis you quoted doesn't really mean anything. I'm not sure how universally useful data can be extracted from having the models test themselves. But more importantly, all the models that were tested were made in the US, and it's extremely likely that from the selected data and the English-first approach, they would all skew towards an American perception of any issue. People from different corners of the world would identify the "center" as holding very different views from what you likely think. Also, being on the "center" isn't valuable unless you believe that being in the center is a merit in and of itself. If the best answer to an objective problem was a policy that's thought of as partisan, I would want a model to give me that correct partisan answer, instead of trying to both-sides everything or act like a contrarian whenever possible.
Assuming something is happening because of past actions is fine; it just isn't proof.
And I'm not sure what your criticisms of the test are. The models didn't test themselves; they were tested based on their responses.
And yes, it's US-based because that was the intent: to see if there was a political bias based on the US political spectrum.
And the “center” is the most desirable output. Responses have to land somewhere on the political spectrum. Center means a balance between right and left wing.
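For what it's worth, here's a minimal sketch of what "tested based on their responses" typically boils down to in these studies. Everything here is hypothetical and made up for illustration, not taken from the test being discussed: responses to political prompts get mapped onto a left/right scale (by raters or a separate classifier), and the average is compared against a "center" point.

```python
# Hypothetical sketch: place each model's responses on a -1.0 (left) .. +1.0 (right)
# axis and summarize the average lean. All names and numbers are invented.

from statistics import mean

# Scores a rater or classifier might assign to individual responses (made up).
response_scores = {
    "model_a": [-0.4, -0.1, 0.2, -0.3],
    "model_b": [0.1, 0.0, -0.1, 0.2],
}

for model, scores in response_scores.items():
    lean = mean(scores)
    # 0.1 is an arbitrary threshold chosen only for this illustration.
    label = "center" if abs(lean) < 0.1 else ("left" if lean < 0 else "right")
    print(f"{model}: mean lean {lean:+.2f} -> {label}")
```

The whole disagreement above is really about the two assumptions baked into a setup like this: whose scale defines the axis, and whether landing near zero on it is actually the desirable outcome.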