
If this cannot eliminate hallucinations, or at least reduce them to the point of being statistically unlikely, and I assume it has more params than GPT-4's trillion parameters, doesn't that mean the scaling law is dead?


I interpret this to mean we're in the ugly part of the old scaling law, where `ln(x)` for `x > $BIGNUMBER` starts becoming punishing, not that the scaling law has been empirically refuted in any way. Maybe someone can crunch the numbers and figure out whether the benchmarks empirically validate the scaling law or not, relative to GPT-4o (assuming e.g. 200 billion params vs 5T params).
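
To put rough numbers on that, here's a quick sketch using just the parameter term of the Chinchilla loss fit (Hoffmann et al. 2022); I'm ignoring the data term entirely, and the param counts are only the guesses from above, not real figures for either model:

    # Parameter term of the Chinchilla loss fit: loss(N) = E + A / N**alpha.
    # Constants from the paper; data term ignored; param counts are guesses.
    E, A, alpha = 1.69, 406.4, 0.34

    def loss(n_params):
        return E + A / n_params**alpha

    for n in (2e11, 1e12, 5e12):  # 200B, 1T, 5T params
        print(f"{n:.0e} params -> predicted loss {loss(n):.3f}")
    # ~1.748, ~1.724, ~1.710: each big jump in params buys a smaller drop.

So even a 25x parameter jump would only shave a few hundredths off the predicted loss, which is consistent with the curve flattening rather than the law breaking.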


Why would they want to eliminate that? Altman said that hallucinations are how LLMs express creativity.


I mean the scaling laws were always logarithmic, and logarithms become arbitrarily close to flat if you can't keep driving them with exponential growth in inputs; even if you can, the payoff is barely linear. The scaling laws always predicted that scaling models up would stop being practical, or at least slow down, at some point.
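
A tiny numerical illustration of the "exponential in, linear out" point (pure toy math, nothing model-specific):

    import math

    # ln(x) gains a fixed ~0.693 per *doubling* of x: you need exponential
    # input growth just to keep the output moving linearly.
    for k in range(4):
        x = 10**6 * 2**k
        print(f"x = {x:>9,} -> ln(x) = {math.log(x):.3f}")
    # With merely linear input growth the increments shrink toward zero:
    # ln(2e6)-ln(1e6) = 0.693, ln(3e6)-ln(2e6) = 0.405, ln(4e6)-ln(3e6) = 0.288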


Right, but the quantum leap in capabilities from GPT-2 -> GPT-3 -> GPT-3.5 Turbo (which I personally felt fared worse at coding than its predecessor) -> GPT-4 won't be replicated anytime soon with pure text/chat generation models.


Sure, that's also predicted by a logarithmic scaling law: you get extremely rapid growth early on, and then diminishing returns set in.



