LMs aren't AGI, but they suggest the search for architecture is essentially settled. What the scaling laws demonstrate is that any architectural improvement you could find by hand can be superseded by a slightly larger transformer.

LMs lack crucial elements of human-like intelligence: long-term memory, short-term memory, episodic memory, proprioceptive awareness, etc. The task now is to implement these things using transformers.



> long-term memory, short-term memory

Doesn't that mean the architecture isn't settled? The current mechanism of appending the model's output back into the prompt feels like a bit of a hack. I'm only a layman here, but it seems transformer models can only propagate information layer-wise; adding some ability to propagate information across time, the way RNNs do, might be useful for achieving longer-term coherence.
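
To make the contrast concrete, here's a rough sketch of the two information-flow patterns in PyTorch. The `model`, `cell`, `head`, and `embed` objects are illustrative stand-ins, not any particular library's API, and a real transformer stack would cache keys/values rather than recompute the whole context each step:

    import torch
    import torch.nn as nn

    # Pattern 1: transformer-style generation. The only "memory"
    # across steps is the token sequence itself: each new token is
    # appended to the context and the whole thing is re-fed to the
    # model. (Illustrative; `model` is assumed to map a token
    # sequence to per-position logits.)
    def generate_transformer(model: nn.Module, tokens: torch.Tensor,
                             n_steps: int) -> torch.Tensor:
        for _ in range(n_steps):
            logits = model(tokens)                    # (batch, seq_len, vocab)
            next_token = logits[:, -1].argmax(dim=-1, keepdim=True)
            tokens = torch.cat([tokens, next_token], dim=1)  # append output back into the "prompt"
        return tokens

    # Pattern 2: RNN-style recurrence. Information propagates across
    # time through a fixed-size hidden state, independent of how many
    # tokens have been consumed so far.
    def generate_rnn(cell: nn.GRUCell, head: nn.Linear, embed: nn.Embedding,
                     token: torch.Tensor, hidden: torch.Tensor, n_steps: int):
        outputs = []
        for _ in range(n_steps):
            hidden = cell(embed(token), hidden)   # state carried forward in time
            token = head(hidden).argmax(dim=-1)   # next token from current state
            outputs.append(token)
        return torch.stack(outputs, dim=1), hidden

The difference is that the transformer loop's only memory is the ever-growing token sequence, while the RNN threads a fixed-size hidden state across steps, which is roughly the "propagate information across time" ability I mean.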



