As models improve, human preference will become worse as a proxy measurement (e....

		arb_ on Sept 13, 2024 \| parent \| context \| favorite \| on: Notes on OpenAI's new o1 chain-of-thought models As models improve, human preference will become worse as a proxy measurement (e.g. as model capabilities surpass the human's ability to judge correctness at a glance). This can be due to more raw capability - or more persuasion / charisma.