LMs aren't AGI, but they suggest the search for architecture is essentially settled. What the scaling laws demonstrate is that any architectural improvement you could find by hand can be superseded by a slightly larger transformer.

LMs lack crucial elements of human-like intelligence: long-term memory, short-term memory, episodic memory, proprioceptive awareness, etc. The task now is to implement these things using transformers.



> long-term memory, short-term memory

Doesn't that mean the architecture isn't settled? The current mechanism of appending the model's output back into the prompt feels like a bit of a hack. I'm only a layman here, but it seems transformer models can only propagate information layer-wise; adding some ability to propagate information across time, the way RNNs do, might be useful for achieving longer-term coherence.
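
To make the contrast concrete, here's a rough sketch of the two information-flow patterns in PyTorch. The `model`, `cell`, `head`, and `embed` objects are illustrative stand-ins, not any particular library's API, and a real transformer stack would cache keys/values rather than recompute the whole context each step:

    import torch
    import torch.nn as nn

    # Pattern 1: transformer-style generation. The only "memory"
    # across steps is the token sequence itself: each new token is
    # appended to the context and the whole thing is re-fed to the
    # model. (Illustrative; `model` is assumed to map a token
    # sequence to per-position logits.)
    def generate_transformer(model: nn.Module, tokens: torch.Tensor,
                             n_steps: int) -> torch.Tensor:
        for _ in range(n_steps):
            logits = model(tokens)                    # (batch, seq_len, vocab)
            next_token = logits[:, -1].argmax(dim=-1, keepdim=True)
            tokens = torch.cat([tokens, next_token], dim=1)  # append output back into the "prompt"
        return tokens

    # Pattern 2: RNN-style recurrence. Information propagates across
    # time through a fixed-size hidden state, independent of how many
    # tokens have been consumed so far.
    def generate_rnn(cell: nn.GRUCell, head: nn.Linear, embed: nn.Embedding,
                     token: torch.Tensor, hidden: torch.Tensor, n_steps: int):
        outputs = []
        for _ in range(n_steps):
            hidden = cell(embed(token), hidden)   # state carried forward in time
            token = head(hidden).argmax(dim=-1)   # next token from current state
            outputs.append(token)
        return torch.stack(outputs, dim=1), hidden

The difference is that the transformer loop's only memory is the ever-growing token sequence, while the RNN threads a fixed-size hidden state across steps, which is roughly the "propagate information across time" ability I mean.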



