
Disagree that we have all the pieces for AGI.

Memory, for one: not only do models need long-term and short-term memory, they also need to be able to selectively forget. Hallucinations are still a big problem; you can easily (and unintentionally) put models in situations where they make up facts. And context limits: effective comprehension is still roughly 8-10k tokens, even though the nominal token limits have been raised to near-infinity.



Models already have long-term memory; that's all your basic LLM is, after all: a gigantic long-term memory with all the faults that come with a neural network, like lossy storage and imperfect retrieval.

But for AGI, we're indeed missing a short-term memory system with the ability to record the passage of time and filter out information not relevant to the task at hand. I don't think it needs to be a neural network like the one we humans have, though. Neural networks were the only storage substrate biology had to work with, but that doesn't mean they're the best solution for AGI, and I don't think the path to AGI is one complicated end-to-end neural network model.
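To make the "record the passage of time and filter out irrelevant stuff" part concrete, here's a toy sketch of what I mean by an external short-term store: entries are timestamped, and anything whose recency weight has decayed below a threshold is forgotten. All names are made up for illustration; a real system would score relevance to the current task, not just age.

  import time

  class ShortTermMemory:
      def __init__(self, half_life_s=300.0):
          self.items = []              # (timestamp, text) pairs
          self.half_life_s = half_life_s

      def add(self, text):
          self.items.append((time.time(), text))

      def recall(self, min_weight=0.1):
          now = time.time()
          def weight(ts):
              return 0.5 ** ((now - ts) / self.half_life_s)
          # entries whose recency weight has decayed below the threshold are forgotten
          self.items = [(ts, t) for ts, t in self.items if weight(ts) >= min_weight]
          # most recent (highest weight) first
          return [t for ts, t in sorted(self.items, key=lambda x: weight(x[0]), reverse=True)]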

AGI, no matter the level of consciousness* you aim for, will probably end up being more like an OS where the processes are agents that work together. You'd have long-running agents, short-running agents, agents that analyze data, agents that apply algorithms, agents that come up with algorithms, agents that criticize and fact-check, agents that classify the memories of other agents, agents that produce data for other agents to use in generating new models, supervising agents, and interface agents that run continuously to interact with the world and/or users (toy sketch below, after the footnote).

*= which I define as the ability to understand that you are an entity existing in an environment that your actions can affect, and the ability to understand that an observed change in the environment might have been caused by a previous action you remember taking. This understanding can exist at different levels, depending mainly on how detailed and how fleeting your short-term memories are.
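And the promised toy sketch of the "OS of agents" idea: a scheduler where agents take turns reading and writing a shared blackboard. Everything here is made up for illustration; real agents would wrap model calls rather than toy logic.

  class Agent:
      def step(self, blackboard):
          raise NotImplementedError

  class FactChecker(Agent):
      def step(self, blackboard):
          verdicts = blackboard.setdefault("verdicts", {})
          for claim in blackboard.get("claims", []):
              verdicts.setdefault(claim, "unchecked")

  class Critic(Agent):
      def step(self, blackboard):
          blackboard["critique"] = "%d claims, %d checked" % (
              len(blackboard.get("claims", [])), len(blackboard.get("verdicts", {})))

  def scheduler(agents, blackboard, ticks=3):
      # the "OS" part: keep scheduling agents against shared state
      for _ in range(ticks):
          for agent in agents:
              agent.step(blackboard)
      return blackboard

  print(scheduler([FactChecker(), Critic()], {"claims": ["2+2=4"]}))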


Why "selectively forget" should be a piece for AGI?


I guess we should start with the fact that models currently have no ability to remember at all.

You either fine-tune, which is a very lossy process that degrades generality, or you do in-context learning/RAG. Forgetting, in its current form, would mean eliminating obsolete context; not forgetting would mean using a million input tokens to answer "what is 2+2?".
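Concretely, "forgetting" today is something bolted on outside the model, e.g. trimming the oldest turns so the prompt fits a budget before every call. Toy sketch, with word counts standing in for real tokenization:

  def prune_context(history, question, budget_tokens=2000):
      def cost(text):
          return len(text.split())       # stand-in for a real tokenizer
      kept, used = [], cost(question)
      for turn in reversed(history):     # walk the history newest-first
          if used + cost(turn) > budget_tokens:
              break                      # everything older is "forgotten"
          kept.append(turn)
          used += cost(turn)
      return list(reversed(kept)) + [question]

  history = ["user: a long, irrelevant digression ..."] * 500 + ["user: my name is Sam"]
  print(prune_context(history, "user: what is 2+2?", budget_tokens=20))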

In any case, any external mechanism for selectively managing context would be far too limiting for AGI.


I think maybe this refers to unlearning wrong information?


Also abstracting. There's no need to remember every millisecond of its lifetime and consult all of them on every query.


I can remember, for example, when I was wrong and how, and still respond correctly; I don't have to forget my wrong answer in order to give the correct one.



