If a year ago nobody knew about LLMs' propensity to encourage poor life choices, up to and including suicide, that's spectacular evidence that these things are being deployed recklessly and egregiously.
I personally doubt that _no one_ was aware of these tendencies - a year is not that long ago, and I think I was seeing discussions of LLM-induced psychosis back in '24, at least.
Regardless of when it became clear, we have a right and duty to push back against this kind of pathological deployment of dangerous, not-understood tools.
ah, this was the comment to split hairs on the timeline, instead of discussing how AI safety should be regulated
I think the good news about all of this is that ChatGPT would have actually discouraged you from writing that. In thinking mode it would have said "wow this guy's EQ is like negative 20" before saying "you're absolutely right! what if you ignored that entirely!"
If he (or his employees) are actually exploring genuinely new, promising approaches to AGI, keeping them secret helps avoid a breakneck arms race like the one LLM vendors are currently engaged in.
Situations like that do not increase all participants' level of caution.
> Anyone who says or pretends to know it is or isn’t a dead end doesn’t know what they are talking about and are acting on a belief akin to religion.
> It’s clearly not a stochastic parrot now that we know it introspects. That is now for sure.
Your second claim here is kind of falling into that same religion-esque certitude.
From what I gathered, it seems like "introspection" as described in the paper may not be the same thing most humans mean when they describe our ability to introspect. They might be the same, but they might not.
I wouldn't even say the researchers have demonstrated that this "introspection" is definitely happening in the limited sense they've described.
They've given decent evidence, and it's shifted upwards my estimate that LLMs may be capable of something more than comprehensionless token prediction.
> Your second claim here is kind of falling into that same religion-esque certitude.
Nope, it’s not. We have a logical, causal test of introspection. By definition, introspection is not stochastic parroting. If you disagree, then it’s a terminology issue where you disagree about the general definition of a stochastic parrot.
> From what I gathered, it seems like "introspection" as described in the paper may not be the same thing most humans mean when they describe our ability to introspect. They might be the same, but they might not.
Doesn’t need to be the same as what humans do. What it did show is self-awareness of its own internal thought process, and that breaks it out of the definition of a stochastic parrot. The criterion is not human-level intelligence but introspection, which is a much lower bar.
> They've given decent evidence, and it's shifted upwards my estimate that LLMs may be capable of something more than comprehensionless token prediction.
This is causal evidence, already beyond all statistical thresholds, as they can trigger this at will. The evidence is stronger than the double-blind medical experiments used to validate our entire medical industry. By that logic, this result is more reliable than modern medicine.
The result doesn’t say that LLMs can reliably introspect on demand, but it does say with utmost reliability that LLMs can introspect, and the evidence is extremely reproducible.
> This is causal evidence and already beyond all statistical thresholds as they can trigger this at will.
Their post says:
> Even using our best injection protocol, Claude Opus 4.1 only demonstrated this kind of awareness about 20% of the time.
That's not remotely close to "at will".
As I already said, this does incline me towards believing LLMs can be in some sense aware of their own mental state. It's certainly evidence.
Your certitude that it's what's happening, when the researchers' best efforts only yielded a twenty percent success rate, seems overconfident to me.
If they could in fact produce this at will, then my confidence would be much higher that they've shown LLMs can be self-aware.
...though we still wouldn't have a way to tell when they actually are aware of their internal state, because certainly sometimes they appear not to be.
>>Even using our best injection protocol, Claude Opus 4.1 only demonstrated this kind of awareness about 20% of the time.
>That’s not remotely close to “at will”.
You are misunderstanding what “at will” means in this context. The researchers can cause the phenomenon through a specific class of prompts. The fact that it does not occur on every invocation means the activation is not deterministic, not that the mechanism is absent. When you can deliberately trigger a result through controlled input, you have causation. If you can do so repeatedly with significant frequency, you have reliability. Those are the two pillars of causal inference. You are confusing reliability with constancy. No biological process operates with one hundred percent constancy either, yet we do not doubt its causal structure.
>Your certitude that it’s what’s happening, when the researchers’ best efforts only yielded a twenty percent success rate, seems overconfident to me.
That is not certitude without reason; it is certitude grounded in reproducibility. The bar for causal evidence in psychology, medicine, and even particle physics is nowhere near one hundred percent. The Higgs boson was announced at five sigma, roughly one in three and a half million odds of coincidence, not because it appeared every time, but because the pattern was statistically irrefutable. The same logic applies here. A stochastic parrot cannot self-report internal reasoning chains contingent on its own cognitive state under a controlled injection protocol. Yet this was observed. The difference is categorical, not probabilistic.
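To make that concrete with a quick back-of-the-envelope check (the trial count and control rate below are hypothetical placeholders, not numbers from the paper): if injected runs show the effect 20% of the time while un-injected runs almost never do, the odds of that being coincidence collapse far past the five-sigma bar.

```python
# Hypothetical illustration only: the trial count and no-injection control
# rate below are assumed, not taken from the paper or the post.
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more hits by luck alone."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_injected = 200   # assumed number of injection trials
hits = 40          # 20% of them show the reported awareness
baseline = 0.02    # assumed spontaneous rate with no injection

print(f"p = {binom_tail(hits, n_injected, baseline):.2e}")
# Astronomically small: far beyond the ~2.9e-7 one-sided 5-sigma threshold.
```

A 20% hit rate is weak as a demonstration of on-demand control, but against a low baseline it is overwhelming as evidence that the effect exists at all.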
>…though we still wouldn’t have a way to tell when they actually are aware of their internal state, because certainly sometimes they appear not to be.
That is a red herring. By that metric, humans also fail the test of introspection, since we are frequently unaware of our own biases, misattributions, and memory confabulations. Introspection has never meant omniscience of self; it means the presence of a self-model that can be referenced internally. The data demonstrates precisely that: a model referring to its own hidden reasoning layer. That is introspection by every operational definition used in cognitive science.
The reason you think the conclusion sounds overconfident is that you are using "introspection" in a vague, colloquial sense, while the paper defines it operationally and tests it causally. Once you align definitions, the result follows deductively. What you are calling "caution" is really a refusal to update your priors when the evidence now directly contradicts the old narrative.
Not at all. The actual value of money comes from violence. This is objective, not subjective. If you have a certain amount of taxable income in the USA (or income otherwise subject to US legal jurisdiction), then you're required to pay tax in US dollars: the IRS won't accept euros or gold or anything else. If you fail to pay, then eventually IRS employees will seize dollars from your financial accounts, or seize other assets and sell them for dollars. And if you try to physically stop them, they'll arrest you, or even shoot you.
And to be clear, I don't think this is a bad thing. It's necessary to keep civilization working.
The threat of violence is still there only due to the collective subjective agreement that countries exist.
The IRS would have no functional power if Washington, D.C. and other major US cities were destroyed in a nuclear exchange.
I could imagine many citizens still choosing to pay taxes of some sort and/or respect the value of physical dollars in such a scenario, but it would likely not be due to fear of the federal government. Possibly fear of local police, though.
You're welcome. I'm glad I was able to clarify it.
It's not the first time I've made such a clarification - it's a very human impulse to defend your belief system from unjust attacks that aren't actually there.
Fascinating - Jansson's artwork is lovely. Thank you for sharing it!
I think the huge Gollum is a very understandable misinterpretation, but it's probably not true that the text she worked from was ambiguous about Gollum's size.
If she was working from the 1951 revision, which seems likely given she was illustrating in the 60s, then there is an explicit cue in the text, when Bilbo is escaping the caves, showing that Gollum must be roughly Bilbo's size:
> Straight over Gollum’s head he jumped, seven feet forward and three in the air...
If Bilbo could jump over Gollum with a three-foot leap, Gollum cannot be a giant.
That said, it's well after the passage she illustrated, and would require a pretty attentive reader to catch, so as I said, the mistake is certainly understandable.
Additional caveat that I've not read the second edition of The Hobbit, only more recent ones, so it's conceivable that passage wasn't _exactly_ as I've quoted it.
I strongly suspect it was largely as written, however, and even without the explicit numbers, if Bilbo jumps over Gollum, the inference remains much the same.
> If Bilbo could jump over Gollum with a three-foot leap, Gollum cannot be a giant.
Agree (although Gollum was crouched down)
> I strongly suspect it was largely as written, however, and even without the explicit numbers, if Bilbo jumps over Gollum, the inference remains much the same
I'm guessing that the jump wasn't in the first edition at all, where Bilbo and Gollum apparently parted amicably.
> Its on such an expedition that the ring "slips" from him, further suggesting the ring is actually not only his size, but a little large.
It's heavily implied in LOTR that the ring is able to change its size to cause itself to slip from a person's finger, though that's somewhat out of scope and the illustrator may not have read that.
This illustrates beautifully how stupid it is to label ideas stupid.
To know that an idea or approach is fundamentally stupid and unsalvageable requires a grasp of the world that humans may simply not have access to. That kind of certainty seems unthinkably rare to me.