The Schaeffer et al. "Mirage" paper showed that many claimed emergent abilities disappear when you use different metrics: what looked like sudden capability jumps were often artifacts of harsh, discontinuous measurements rather than smooth ones.
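
To make that concrete, here's a toy simulation (my own numbers, not from the paper): assume per-token accuracy improves smoothly with scale, and watch how an all-or-nothing metric like exact match over a hypothetical 10-token answer still produces an apparent jump:

    # Toy illustration: smooth per-token improvement, "emergent" exact match.
    import math

    answer_len = 10  # hypothetical task scored by exact match on 10 tokens

    for params in [1e8, 1e9, 1e10, 1e11, 1e12, 1e13]:
        # assume per-token accuracy grows smoothly (log-linearly) with scale
        per_token = min(0.99, 0.35 + 0.13 * math.log10(params / 1e8))
        # exact match requires every token right, so it behaves like p^L
        exact_match = per_token ** answer_len
        print(f"{params:.0e} params: per-token {per_token:.2f}, "
              f"exact match {exact_match:.3f}")

The smooth metric climbs steadily from 0.35 to 0.99, while exact match sits near zero for four orders of magnitude and then shoots up. Nothing discontinuous happened in the model; the discontinuity lives in the metric.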

But I'd go further: even abilities that do appear "emergent" often aren't that mysterious when you consider the training data. Take instruction following - it seems magical that models can suddenly follow instructions they weren't explicitly trained for, but modern LLMs are trained on massive instruction-following datasets and then tuned with techniques like RLHF and Constitutional AI. The model is literally predicting the kind of text it was trained on. Same with chain-of-thought reasoning - these models have seen millions of examples of step-by-step reasoning in their training data.

The real question isn't whether these abilities are "emergent" but whether we're measuring the right things and being honest about what our training data contains. A lot of seemingly surprising capabilities become much less surprising when you audit what was actually in the training corpus.
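
To be concrete about what an audit could look like: a crude first pass is n-gram overlap between benchmark items and training documents. This is a minimal sketch with made-up data (the names, the sample strings, and the choice of 8-grams are all my assumptions; real contamination checks need a scalable index over terabytes, not an in-memory set):

    # Sketch of a naive contamination check via shared 8-grams.
    def ngrams(text, n=8):
        toks = text.lower().split()
        return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    # toy stand-ins for the real corpus and benchmark
    training_corpus = ["let us think step by step about the problem first"]
    eval_set = ["let us think step by step about the problem of tiling"]

    train_grams = set()
    for doc in training_corpus:
        train_grams |= ngrams(doc)

    for example in eval_set:
        hits = ngrams(example) & train_grams
        if hits:
            print("possible contamination:", sorted(hits)[:2])

Even this blunt check flags the overlapping span, which is the point: a benchmark "win" means much less if near-duplicates of the test items were sitting in the training corpus.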


