This is a fair point. When people talk about LLM writing, they're always picking on its visible tics and clear flaws. It's a lot more uncomfortable to talk about the things it does better than most of us. There is a lot of precision in how top models like Opus choose words and phrasing. Lately I've had Opus explain some things to me that I'd never really been able to grasp otherwise, in fairly concise conversations.
I think this is on point; I've really started to think about LLMs in terms of attention budget more than tokens. There are only so many things they can do at once, so which ones are most important to you?
Outputting "filler" tokens also basically doesn't require much "thinking" from an LLM, so the "attention budget" can be used to compute something else during the forward passes that produce those tokens. So besides the additional constraints imposed, you're also removing one of the ways in which it thinks. Explicit CoT helps mitigate some of this, but if you want to squeeze out every drop of computational budget you can get, I'd think it beneficial to keep the filler as-is.
If you really wanted, you could just have a separate model summarize the output to remove the filler.
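A minimal sketch of that two-pass idea, assuming a hypothetical call_model() helper as a stand-in for whatever completion API you actually use:

```python
# Two-pass pipeline sketch: the verbose model "thinks out loud",
# then a cheaper model strips the filler after the fact.
# call_model() is a hypothetical placeholder, not a real API.

def call_model(prompt: str, model: str) -> str:
    raise NotImplementedError("plug in your provider's completion call here")

def answer_then_compress(question: str) -> str:
    # Pass 1: let the big model ramble; the filler tokens are where the
    # extra forward passes (and thus extra computation) happen.
    verbose = call_model(question, model="big-model")
    # Pass 2: the summarizer only sees finished text, so trimming it
    # costs the first model nothing.
    return call_model(
        "Summarize the following, dropping filler but keeping every "
        "technical detail:\n\n" + verbose,
        model="small-model",
    )
```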
This is true, but I also think the input context isn't the only function of those tokens...
As those tokens flow through the QKV transforms, across 96 consecutive layers, they become the canvas where all the activations happen. Even in cases where it's possible to communicate some detail in the absolute minimum number of tokens, I think excess brevity can still limit the intelligence of the agent, because it starves the agent's cognitive budget for solving the problem.
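A toy numpy sketch of that intuition (single attention head, random weights; nothing model-specific assumed beyond the shape of the computation): every token in the context is a row in the activation matrices, so each extra token buys more pairwise work at every one of those stacked layers.

```python
import numpy as np

d = 64                                        # toy hidden size
Wq, Wk, Wv = (np.random.randn(d, d) / np.sqrt(d) for _ in range(3))

def attention_layer(x: np.ndarray) -> np.ndarray:
    """x: (seq_len, d) -- one row of 'canvas' per token in the context."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)             # (seq_len, seq_len) interactions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

for seq_len in (8, 64, 512):
    attention_layer(np.random.randn(seq_len, d))
    # the score matrix alone grows quadratically with context length,
    # and a real model repeats this at every layer of its stack
    print(f"{seq_len:4d} tokens -> {seq_len * seq_len:7d} attention scores per layer")
```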
I always talk to my agents in highly precise language, but I let A LOT of my personality come through at the same time. I talk to them like a really good teammate, one who has a deep intuition for the problem and knows me personally well enough to talk with me in rich abstractions and metaphors, while still having an absolutely rock-solid command of the technical details.
But I do think this kind of caveman talk might be very handy in a lot of situations where the agent is doing simple obvious things and you just want to save tokens. Very cool!
A brief look at certain Native American tribes shows quite a lot of talking and consensus building: if some war chief wants a war, he needs to drum up support for it. Hours of talking ensue! Not to say that ancient tribes didn't have the worst of what modern corporations have to offer as far as leadership goes, but the claim about "basically every village" is basically wrong, or "basically" is carrying a heck of a lot of weight.
All Native tribes have been thoroughly dominated, decimated, and constrained to reservations by waves of brutal colonists who were, as I said, the most ruthless.
Except where the lucky win, or the most ruthless actually suck at winning wars (Bret Devereaux has some interesting observations here, and a study of the various "unbeatable" and of course ruthless empires may also be educational), or where various species cooperate in various ways, or where nobody cries when Mr. Ruthless mutters "rosebud" then dies. "Well, it couldn't have happened to a better chap", said the butler.
Cooperation and competition are typically temporal periods, but only one is extractive irrespective of externalities; cooperation can be mutual yet still have devastating externalities (ecological collapse).
So unfortunately it's insufficient to simply be cooperative, and macro-level cooperation appears to be rare in the universe anyway.
Further, the existing examples of mutually cooperative organizations are so rare as to be nearly non-existent. Humans seem to prefer (or are biologically limited to preferring) competition-based social and economic structures.
Read some Charles Mann. Tribal leaders, if they can really be described as leaders, had to work through consensus and cooperation. Modern society is much more coercive.
Cherokee councils, which included both men and women, required unanimous agreement for any group decision.
In their society, and in so many others who have been crushed by the forces of empire over the eons, the leaders of the people did not get there by murdering their way to the top. They were respected persons who were elevated to that position by the people.
Tradition tells us the Cherokee did once have a hereditary priestly class, called the Ani-kutani, or Nicotani. The people long suffered under their arrogance until one of them went too far, raping a woman while her husband was away. Her husband then raised an uprising of the people, and they killed the Nicotani to the last man.
(An existence proof that it can be done, if nothing else.)
They write code differently but that doesn't mean that's the kind of code they prefer to read. Don't ascribe too much intention to a stochastic process.
Their coding style is, above all else, a symptom of their very limited context window and complete amnesia for anything that's not in the window.
I don't think there's intention. And yes, its output is defined by its limits. But it's not just the context, is it? Their coding style is, above all else, the result of an algorithm and its input: the training data, the reinforcement, the model design, the tuning, the prompt, the context. Change any one of those things and the code changes. They are a system, like an ecosystem. Let water flow and it finds its own path, but try to dam it and it creates unintended consequences. I think what we're going to find is that some of our rules apply more to a human world than an LLM world.
Good points. What local models have you found work best for your use cases? I feel like if we get to Opus 4.6-level intelligence running on local hardware, we're in the clear for a lot of day-to-day use cases.
Every contract I have to agree to these days has a "valid until unilaterally invalidated" clause. It feels like we're all just going through the motions.
Exactly. There's a clear alternative in my mind, one I'm sure is objectionable in its own way but I think is the least evil of the three: require providers to label their content and make them liable for it. This allows parents to do the censoring, which is functionally impossible now because no parent can fight the slippery power of multibillion-dollar software investments designed to prevent them from having control over what their kids see.
I really feel like we have not encountered the same stupid people. Most stupid people I know respond to every question with some form of will-not-attempt. What's 74 times 2? Use a calculator! Should I drive or walk to the car wash? Not my problem! How many R's in strawberry? Who cares! They'll lose to the LLM 100%.
The cheapest Aliexpress calculator can multiply much bigger numbers than I can in my head, and it can do it instantly. Does that mean that the calculator is “smarter” than me?
Your true average human is someone like your barista at Starbucks. Try giving them a good math problem, logic puzzle, or leetcode problem if you need some reminding of the standard reasoning capabilities of our species. LLMs cannot beat the best humans at practically anything, but average humans? Average humans are a much softer target than this thread seems to think.
Completely disagree. Inability to handle specific math or CS is a matter of training and experience, not reasoning and intelligence. The barista is quite capable of reasoning and learning feats that LLMs aren't close to.
Yeah, there appears to be this idea that "being smart" is the same thing as "knowing facts", which I don't think is realistic.
I know plenty of people who are considerably smarter than me, but don't know nearly as much as I do about computer science or obscure '90s video game trivia. Just because I know more facts than they do (at least in this very limited scope) doesn't mean that they're less capable of learning than I am.
As you said, a barista is very likely able to reason about and learn new things, which is not something an LLM can really do.
I think it would be fairly easy to prove or disprove that 'AI as it is today knows more about any subject than 99% of HN'. But knowledge alone does not translate into intelligence and that's the problem: we don't have a really hard definition of what intelligence really is. There are many reasons for that (such as that it would require us to reconsider some of our past actions), but the fact remains.
So until we really, once and for all, nail down what intelligence is, you get this god-of-the-gaps-like problem where every time we find something that looks and feels truly intelligent by yesterday's standards, that intelligence gets crammed into a slightly smaller space excluding the thing that just became possible.
The rate of change is a factor here. Arguably the current rate of change is very high compared with two decades ago, but compared with three years ago it feels as if we're already leveling off, more focused on tooling and infrastructure than on intelligence itself.
Intelligence may not actually have a proper definition at all; it seems to be an emergent phenomenon rather than something you engineer for, and there may well be many pathways to intelligence and many different kinds of intelligence.
What gets me about AI so far is that it can be amazing one minute and so incredibly stupid the next that it is cringeworthy. It gives me an idiot-savant kind of vibe rather than feeling like an actually intelligent party. If it were really intelligent, I would expect it to learn as much or more from the interaction, and to be able to have a conversation with one party where it learns something useful and then immediately apply that new bit of knowledge in all the other ones.
Humans don't need to be taught the same facts over and over again, though it may help with long term retention. We are able to reason about things based on very limited information and while we get stuff wrong - and frequently so - we usually also know quite precisely where the limits of our knowledge are, even if we don't always act like it.
To me it is one of those 'I'll know it when I see it' things. Without insulting anybody, including the baristas at Starbucks, I think it is perfectly possible to have a discussion about this and to accept that average humans all have different skills and specialties, and that some people work at Starbucks because they want to and others because they have to; it does not say anything per se about their intelligence or lack thereof. At the same time you can be IQ 140 but still dumber than a Starbucks barista at what it takes to make someone feel comfortable and how to make coffee.
We seem to largely agree but I wanted to respond to this one bit:
> you get this god-of-the-gaps-like problem where every time we find something that looks and feels truly intelligent by yesterday's standards, that intelligence gets crammed into a slightly smaller space excluding the thing that just became possible.
It's important to distinguish between "AI" and "AGI" here. I haven't seen many objections that the frontier models of the past year or so don't qualify as AI (whatever that might or might not mean) and the ones I have seen don't seem to hold much water.
However, there's a constant stream of bogus claims presenting some new feat as "AGI", and each time we collectively stop and revise our working definition to close the latest loophole for something that is very obviously not AGI. Thus, IMO, "legal loophole" is a more fitting description than "god of the gaps".
I do think we're nearing human level in general and have already exceeded it in specific tightly constrained domains but I don't think that was ever the common understanding of AGI. Go watch 80s movies and they've got humanoid robots walking around doing freeform housework while chatting with the homeowner. Meanwhile transferring dirty laundry from a hamper to the drum remains a cutting edge research problem for us, let alone wielding kitchen knives or handling things on the stovetop.
And yet if you asked that barista if you should walk to the car wash or take your car there, they would never respond with "you should take a walk, it's healthier than driving" like almost every LLM did in a test I saw.
That is as basic as everyday reasoning gets, and any human in modern society solves hundreds of problems like that every day without even thinking about it, but with LLMs it's a dice roll. Testing them with leetcode problems or logic puzzles is not going to prove much unless you first make sure none of those were in the training data, to rule out pure memorization.