I am confused about why Gary Marcus thinks it's so obvious that Claude isn't conscious. As he points out, Dawkins is just taking a bog-standard behaviorist position: that he can't distinguish Claude from a conscious being just by the behavior here.
Marcus is saying "Well, if you knew they were trained to mimic, then you'd understand it's just mimicry and not real consciousness." The problem with this argument is that we just don't have a good idea what "real consciousness" is. What if, in order to simulate human text prediction with sufficient accuracy, the model has to assemble sub-networks internally into something equivalent to a conscious mind? We could disprove that kind of thing really quickly if we knew how to define consciousness rigorously, but we kinda don't!
Philosophers are genuinely split on this question, so it's totally reasonable to land on either side based on your personal intuition. Marcus's position seems to actually be based on his own personal incredulity, despite his claims that understanding LLM training methodology gives him some special insight into the internal experience (or lack thereof) of an LLM.
Gary Marcus here is making an argument about souls and just doesn't realize it. You could rewrite this whole post replacing "consciousness" with "soul" and it would flow almost the same.
He handwaves consciousness as "internal states," as if that means anything and as if an LLM has no internal state. (This seems to be the analog for "divine touch.") He can't define consciousness rigorously, partly because we don't really understand consciousness, but also because any rigorous definition would open the door to a scientific response.
And honestly pretty great, unless you are a collector. It's well done.
The book itself is beautiful and haunting. But I don't think it's for everyone... I have a copy, and I gifted one to someone in my family who really didn't understand the point.
Sometimes, in the interest of having something rather than nothing, I have to press publish. This entails getting things wrong, which is regrettable.
I will say that I'm trying to steelman the code-as-assembly POV, and I don't think the exact historical analogy is critical to it being right or wrong. The main thing is that "we've seen the level of abstraction go up before, and people complained, but this is no different" is the crux. In that sense, a folk history is fine as long as the pattern is real.
This is an interesting distinction, but it ignores the reasons software engineers do that.
First, hardware engineers are dealing with the same laws of physics every time. Materials have known properties, etc.
Software: there are few laws of physics (mostly performance and asymptotic complexity), and most software isn't anywhere near those boundaries, so you get to pretend they don't exist. If you get to invent your own physics each time, yeah, the process is going to look very different.
For most generations of hardware you're correct, but not all. For example, high-k dielectrics were introduced to mitigate gate leakage from tunneling. Sometimes, as geometries shrink, the physics involved does change.
This just doesn't explain things by itself. It doesn't explain why humans would care about reasoning in the first place. It's like explaining all life as parasitic while ignoring where the hosts get their energy from.
Think about it: if all reasoning is post-hoc rationalization, then reasons are useless. Imagine a mentally ill person on the street yelling at you as you pass by: you're going to ignore those noises, not try to interpret their meaning and let them influence your beliefs.
This theory is too cynical. The real answer has got to have some element of "reasoning is useful because it somehow improves our predictions about the world"
Skills won't use less context once invoked; the point is that MCP in particular frontloads a bunch of stuff about the entire API surface area into your context. So even if the agent never invokes the MCP, it's costing you.
That's why it's common advice to turn off MCPs for tools you don't think are relevant to the task at hand.
The idea behind skills is that they're progressively disclosed: each one only takes up a short description in the context, and the agent expands a skill's full documentation only if it decides it's relevant.
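To make the trade-off concrete, here's a minimal back-of-the-envelope sketch. The function names and token counts are purely illustrative assumptions, not any real MCP or skills API; the point is just that frontloading scales with the total surface area, while progressive disclosure scales with what you actually use.

```python
# Illustrative sketch (hypothetical numbers, not a real MCP/skills API):
# compare the context cost of frontloading full tool schemas vs. loading
# short stubs upfront and expanding only the skills actually invoked.

FULL_SCHEMA_TOKENS = 400   # assumed cost of one tool's full definition
STUB_TOKENS = 25           # assumed cost of a one-line skill description

def mcp_upfront_cost(num_tools: int) -> int:
    """MCP-style: every tool's full schema sits in context from turn one."""
    return num_tools * FULL_SCHEMA_TOKENS

def skills_cost(num_skills: int, invoked: int) -> int:
    """Skills-style: stubs upfront, full docs loaded only when invoked."""
    return num_skills * STUB_TOKENS + invoked * FULL_SCHEMA_TOKENS

# With 20 tools available but only 2 actually used in a session:
print(mcp_upfront_cost(20))   # 8000 tokens spent before any work happens
print(skills_cost(20, 2))     # 1300 tokens total
```

If every skill ends up invoked, the skills approach costs slightly more than frontloading; the win comes from the common case where most of the surface area goes unused.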
What Stripe did for payments, Pylon is doing for the mortgage industry: We're taking a sleepy industry with backward technology and re-building the stack from the ground up. We're first-principles thinkers, and our team is small, talented and ambitious.
I'm hiring generalists who love coding and want to build something beautiful in an industry where technology written in the 90s is the norm. We're Series A, well funded, and we have traction with customers. Come to Menlo Park and help us turn the $13 trillion US mortgage industry into a set of APIs.
And how hard it is shows that zfs didn't make a bad choice in not attempting the same. (Though it would be interesting if either project had aimed for a clone, that is, the same on-disk data structures. Interesting, but probably a bad decision: the project is more than 10 years old, so I have no doubt there is something about zfs's design that they regret today.)
(The Claude Delusion is a banger title though)