I agree that the effect can look similar, we both end up with a compressed repre...

ninetyninenine · 2025-09-27T02:50:43 1758941443

>LLM's summarization is a lossy compression algorithm that picks entities and parts that it deems "important" against its trained data, not only is lossy, it is wasteful as it doesn't curate what to keep or purge off accumulated experience, it does it against some statistical function that executes against a big blob of data it ingested during training. You could throw contextual cues to improve the summarization, but that's as good as it gets.

No it's not as good as it gets. You can tell the LLM to purge and accumulate experience into it's memory. It can curate it for sure.

"ChatGPT summarize the important parts of this text remove things that are unimportant." Then take that summary feed it into a new context window. Boom. At a high level if you can do that kind of thing with chatGPT then you can program LLMs to do the same thing similar to COT. In this case rather then building off a context window, it rewrites it's own context window into summaries.