
Would be cool if the LLM can break up the request into sub-requests processable by LLMs. Current talk about agents mentions some sort of router/orchestrator that delegates to other agents. But these can be another LLM, another agent, another router itself, or a simple tool call, etc.: all function calls that wrap other LLM-enabled sub-components.
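Roughly this shape, as a minimal sketch (call_llm() and the agent names here are hypothetical stand-ins, not any particular framework):

    # Hypothetical sketch: a router that is itself just another LLM call.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in whatever model API you use")

    # Each "agent" is just a function; some wrap an LLM, one is a plain tool.
    AGENTS = {
        "summarize": lambda task: call_llm("Summarize: " + task),
        "code": lambda task: call_llm("Write code for: " + task),
        "calculate": lambda task: str(eval(task)),  # demo-only tool call
    }

    def route(request: str) -> str:
        plan = call_llm(
            "Split this request into sub-tasks, one per line, "
            "each as '<agent>: <task>' using agents "
            + str(list(AGENTS)) + ":\n" + request
        )
        results = []
        for line in plan.splitlines():
            agent, _, task = line.partition(":")
            results.append(AGENTS[agent.strip()](task.strip()))
        return "\n".join(results)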

My feeling is that we have the pieces to build AGI. Like humans, we don't need a 400-IQ person to solve all problems ('AGI'). What we have is a coordination problem, and in LLM land it's 'the glue' that's missing. Hopefully it's a matter of patterns/best practices emerging.



Disagree that we have all the pieces for AGI.

Memory, for one: not only do models need long-term and short-term memory, they also need to be able to selectively forget. Hallucinations are still a big problem; you can easily (unintentionally) put the models in situations where they make up facts. And context limits: comprehension limits are still effectively 8-10k tokens even though the nominal token limits have been raised to infinity.


Models already have long-term memory; that's all your basic LLM is, after all: a gigantic long-term memory with all the faults that come with a neural network, like lossy storage and imperfect memory retrieval.

But for AGI, we're indeed missing a short-term memory system with the ability to record the passage of time and filter out information not relevant to the task at hand. I don't think it should be a neural network like the one we humans have, though. Neural networks for storing information were the only thing biology had to work with, but that doesn't mean they're the best solution for AGI, and I don't think the path to AGI is one complicated end-to-end neural network model.

AGI, no matter the level of consciousness* you aim for, will probably end up being more like an OS where the processes are agents that work together. You'd have long-running agents, short-running agents, agents that analyze data, agents that apply algorithms, agents that come up with algorithms, agents that criticize and fact-check, agents that classify the memories of other agents, agents that produce data for other agents to use in generating new models, and supervising agents and interface agents that run continuously to interact with the world and/or users (roughly the shape of the toy sketch after the footnote).

*= which I define as the ability to understand that you are an entity existing in an environment that can be affected by an action, and also the ability to understand that an observed change in the environment might have been caused by a previous action that you remember taking. This understanding comes in degrees, depending mainly on how detailed and how fleeting your short-term memories are.
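To illustrate, a toy scheduler in that spirit; everything here is hypothetical and only meant to show the shape, not any real design:

    # Toy "OS of agents": agents as processes on a round-robin scheduler.
    from collections import deque
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Agent:
        name: str
        step: Callable[[dict], bool]  # returns True once finished
        long_running: bool = False    # supervisors/interfaces keep running

    def run(agents: list[Agent], memory: dict, ticks: int = 100) -> None:
        queue = deque(agents)
        for _ in range(ticks):
            if not queue:
                break
            agent = queue.popleft()
            done = agent.step(memory)  # agents communicate via shared memory
            # Short-running agents leave once done; long-running agents
            # (interfaces, supervisors) are always rescheduled.
            if agent.long_running or not done:
                queue.append(agent)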


Why "selectively forget" should be a piece for AGI?


I guess we should start with the fact that models currently have no ability to form new memories at all.

You either fine-tune, which is a very lossy process that degrades generality, or you do in-context learning/RAG. Forgetting, in its current form, would mean eliminating obsolete context; not forgetting would mean using 1 million input tokens to answer "what is 2+2?".
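As a sketch of that kind of forgetting, something like this, where score_relevance() is a hypothetical helper (it could be an embedding-similarity check or a cheap LLM call):

    # Hypothetical sketch: drop low-relevance turns before each model call.
    def score_relevance(turn: str, task: str) -> float:
        raise NotImplementedError("e.g. embedding similarity to the task")

    def prune_context(history: list[str], task: str, budget: int) -> list[str]:
        # Greedily keep the most relevant turns within a character budget
        # (characters as a crude proxy for tokens).
        scored = sorted(range(len(history)),
                        key=lambda i: score_relevance(history[i], task),
                        reverse=True)
        kept, used = set(), 0
        for i in scored:
            if used + len(history[i]) <= budget:
                kept.add(i)
                used += len(history[i])
        # Everything else is "forgotten" for this call; survivors keep
        # their original conversation order.
        return [history[i] for i in sorted(kept)]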

In any case, any external mechanism to selectively manage context would be far too limiting for AGI.


I think maybe this refers to unlearning wrong information?


Also abstracting: no need to remember every millisecond of its lifetime and consult all of them on every query.
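Right, something like rolling summarization, sketched here with a hypothetical summarize() standing in for whatever model call you like:

    # Hypothetical sketch: collapse old turns into one abstract summary
    # instead of replaying every raw detail on every query.
    def summarize(turns: list[str]) -> str:
        raise NotImplementedError("e.g. an LLM call: 'compress these notes'")

    def compact_memory(history: list[str], keep_recent: int = 10) -> list[str]:
        if len(history) <= keep_recent:
            return history
        old, recent = history[:-keep_recent], history[-keep_recent:]
        return [summarize(old)] + recent  # one abstraction + recent detail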


I can remember, for example, when I was wrong and how, and still respond correctly; I don't have to forget my wrong answer in order to give the correct one.


And yes, I share the view/feeling that we've basically got the AGI building blocks. Models will continue improving, but we can already get most of what we need just by orchestrating the latest generation of SOTA models. Crazy time to be alive!


> But these can be another LLM

Yes! I share the feeling that once LLMs get good enough at some abstraction level, you can always put another "level" on top that abstracts what already works into bite-sized pieces. Hassabis also mentions this in a recent podcast: different levels of abstraction. We'll probably see some tooling in this space shortly to coordinate between the different levels. And then RL it and watch it demolish planning-task benchmarks.
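Something like a recursive planner, sketched below with hypothetical call_llm() and is_atomic() helpers:

    # Hypothetical sketch: decompose a goal until sub-tasks fit one model call.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("plug in whatever model API you use")

    def is_atomic(task: str) -> bool:
        # Could itself be a cheap model call; here a crude length heuristic.
        return len(task) < 200

    def solve(task: str, depth: int = 0, max_depth: int = 3) -> str:
        if is_atomic(task) or depth >= max_depth:
            return call_llm("Do this: " + task)
        steps = call_llm("Split into steps, one per line: " + task).splitlines()
        results = [solve(s, depth + 1, max_depth) for s in steps if s.strip()]
        # The level above abstracts over whatever the levels below produced.
        return call_llm("Combine these step results:\n" + "\n".join(results))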

We might very well already be at the point where every level is achievable; we just have to glue them together.


I bet it can do that if hooked up to an agent system. Rate limits are still very restrictive in the free API, but as soon as they make it available for more frequent use, we'll find out.


"Would be cool if the LLM can break up the request into sub-requests processable by LLMs."

It almost certainly can. Try asking Gemini 2.5 Pro to do that and see what happens.


The LLM itself doesn't even need to do that. The actual system/front end that people interact with can wrap that step. Plandex does it, for example, and has been doing it since before integrated reasoning models existed.
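In its simplest form, that's a fixed plan-then-execute pass in the app layer, roughly like this (ask_model() is a hypothetical stand-in for any chat API):

    # Hypothetical sketch: the front end owns the decomposition step.
    def ask_model(prompt: str) -> str:
        raise NotImplementedError("plug in whatever model API you use")

    def handle_request(user_request: str) -> str:
        # Pass 1: ask for a plan before doing any real work.
        steps = ask_model("List the steps needed, one per line:\n"
                          + user_request).splitlines()
        # Pass 2: execute each step, threading through accumulated results.
        notes = []
        for step in steps:
            if not step.strip():
                continue
            context = "\n".join(notes)
            notes.append(ask_model("Context so far:\n" + context
                                   + "\nNow do: " + step))
        return notes[-1] if notes else ask_model(user_request)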

I mean, it's nice when the models can integrate the step-by-step reasoning internally... but I feel people have been missing out on the complex interactions by expecting it all from one ad hoc prompt.


I think the feeling is that for this to really be AGI, it has to take in a single prompt and then delegate behind the scenes to an enormous tree of sub-agents if needed.

One app that comes to mind is Google's Conversational Agents. The routing is done just by referencing another agent in the instructions; no need to explicitly link beyond the prompt.


What parts of the stack or what patterns do you feel are missing? Where does your gut tell you is the 80/20?


Tools like Claude Code already do this.



