> He saw that languages are inseparable from the context in which they are used
That is one of the things that stood out to me in Searle's summary of his later work, because it makes me think about how the transformer architecture works and the way the surrounding context plays into the meaning of the words.
> Part of learning those language games is experimenting in the real world
It is interesting that the article we are responding to talks about how we have only just begun to experiment with RL on top of transformers. In the same way that AlphaGo engaged in adversarial play, we can envision LLMs being augmented to play language games amongst themselves. That may result in their own language, distinct from human language. But it may also result in the formation of intelligence surpassing human intelligence.
> The fidelity of text is simply too low to communicate the amount of information humans use to build up general intelligence.
This does not at all follow from anything I've encountered in Wittgenstein. It is an empirical claim that we (as in humanity) are going to test and not something that I would argue either one of us can know simply reasoning from first principles.
What does follow for me is closer to what Steven Pinker has been proposing in his own critiques of LLMs and AGI, which is that there is no necessary correlation between goal seeking (or morality) and intelligence. I also feel this is concordant with Wittgenstein's own work.
> Without the ability to interact with the physical world, LLMs will never be able to reach AGI
Again, a confident claim that is based on nothing other than your own belief. As I stated in my last comment, I am excited to see if that is empirically true or false. We are definitely going to scale up LLMs in the coming decade and so we are likely to find out.
My suspicion is that people don't want this scaling up to work because it would force them to let go of metaphysical commitments they have on both the nature of intelligence as well as the nature of reality. And for this reason they are adamantly disbelieving in even the possibility before the evidence has been gathered.
I'm happy to stay agnostic until the evidence is in. Thankfully, it shouldn't take too long so I may be lucky enough to find out in my own lifetime.
It's not a belief or a logically derived claim; it's as close to empirical fact as we're going to get. We have zero evidence that intelligence without physical experimentation is possible, because we have no examples of intelligence other than humans (and nonhuman animals), all of whom learned experimentally with a physical feedback loop. Even the most extreme cases like Helen Keller depended on it - her story is perhaps far more useful for grounding theories about AGI than any philosophical text, as Wittgenstein himself would likely argue (Water!). His contempt, for lack of a better word, for philosophy on those terms is clear.
I'm excited to see how LLMs scale, but they won't reach AGI without a much richer architecture that is capable of experimentation, of playing "language games" with other humans, and of remembering what it learned.
(I'm fairly certain of my views given my experience in neuroscience but it's fun to talk Wittgenstein in the context of LLMs, something that's been conspicuously missing. Sadly I don't believe discussions of AGI are fruitful, just what LLMs can teach us about the nature of language)
Now that I have a bit more time let me try a more substantive and less combative (and more drunk) reply :)
> That is one of the things that stood out to me in Searle's summary of his later work, because it makes me think about how the transformer architecture works and the way the surrounding context plays into the meaning of the words.
That's what makes transformer LLMs so interesting! Clearly they have captured a lot of what it means to be intelligent vis-à-vis language use, but is that enough to capture the kind of innate knowledge that defines intelligence at a human level? Based on my experiments with LLMs, not yet. One of the clearest signs IMO is that there is no pedagogical foundation to the LLM's answers. It can mimic explanations it learned on the internet, but it cannot predict how to best explain a concept by implicitly picking up context from wrong answers or confusing questions. There is no "self reflection" because the algorithm as designed is incapable of it outside of RLHF and fine-tuning.
> It is interesting that the article we are responding to talks about how we have only just begun to experiment with RL on top of transformers. In the same way that AlphaGo engaged in adversarial play, we can envision LLMs being augmented to play language games amongst themselves. That may result in their own language, distinct from human language. But it may also result in the formation of intelligence surpassing human intelligence.
I think they already have their own language - embeddings! That really shines through with the multi-modal LLMs like GPT-4V and LLaVA. What's curious is that we stumbled onto the same concept long before our algorithms showed any "intelligence", and it even helped Google move past the PageRank days. That's probably one of the first steps towards intelligence but far from sufficient.
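To make that concrete, here's a toy sketch with made-up 4-dimensional vectors standing in for the thousands of dimensions a real model uses; the point is that in embedding space "meaning" is just geometry, and a caption and an image of the same thing can land next to each other without any human language in between:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 means "pointing the same way" (same meaning), ~0.0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 4-d "embeddings"; real models use a learned encoder per modality,
# but the geometry works the same way.
text_cat   = np.array([0.9, 0.1, 0.0, 0.2])  # caption: "a cat"
image_cat  = np.array([0.8, 0.2, 0.1, 0.1])  # a photo of a cat
text_stock = np.array([0.0, 0.1, 0.9, 0.7])  # "quarterly stock report"

print(cosine_similarity(text_cat, image_cat))   # high (~0.98): same concept, different modality
print(cosine_similarity(text_cat, text_stock))  # low  (~0.14): unrelated concepts
```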
That brings up the fun question of what is enough to surpass human intelligence? I'm trying to apply LLMs to help make sense of the insane size of the American legal code, and I can scale that process up to thousands of GPUs in a matter of seconds (as long as I can afford it). Even if it's at the level of a relatively dumb intern, that's a huge upside when talking about documents that would otherwise take years to read. Is that enough to claim intelligence, even if it's not superior to a trained paralegal/lawyer?
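The reason it scales so easily is that the work is embarrassingly parallel. A hypothetical sketch (summarize() here is a placeholder, not any real API; swap in whatever model you're actually paying for):

```python
from concurrent.futures import ThreadPoolExecutor

def summarize(section_text: str) -> str:
    # Placeholder for the actual LLM call; the model, prompt, and API are all
    # swappable. The only thing that matters is that each section is an
    # independent piece of work.
    return f"summary of {len(section_text):,} chars of statute"

def digest_legal_code(sections: list[str], workers: int = 64) -> list[str]:
    # One "dumb intern" per section, thousands of them at once if the budget
    # allows. Wall-clock time collapses from years of human reading to roughly
    # the latency of a single call.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize, sections))

print(digest_legal_code(["Sec. 1. Example text...", "Sec. 2. More text..."]))
```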
>> The fidelity of text is simply too low to communicate the amount of information humans use to build up general intelligence.
> This does not at all follow from anything I've encountered in Wittgenstein. It is an empirical claim that we (as in humanity) are going to test and not something that I would argue either one of us can know simply reasoning from first principles.
Wittgenstein alone is not enough to come to this conclusion because it requires a peek at cognitive neuroscience and information theory which Mr W would have been woefully behind on given his time period. In short, just like the LLM "compresses" its training data to weights, all of human perception has to be compressed into language to communicate, which I think is impossible. We're talking about (age in years) * (365 days/year) * (X hours awake per day) * (500 megapixels per eye) * (2 eyes) + (all the other senses) versus however many bits it takes to represent language. I don't want to do the math on the latter cause I'm several beers in but it's not even close. 10 orders of magnitude wouldn't surprise me. 10 gigabytes of visual and other sensory input per 1 byte of language isn't out of the question.
I'm totally speculating and pulling numbers out of my ass here but the information theoretic part is undeniable: each human has access to more training data than it is possible for ChatGPT to experience. The quality of that training data ranges from "PEEKABOO!" to graduate textbooks, but its volume is incalculable and volume matters a lot to unsupervised algorithms like humans and LLMs.
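For what it's worth, here's one way to plug numbers into that back-of-envelope estimate; every figure is a deliberately conservative guess, and the gap is still enormous:

```python
# Every number here is a lowball guess; the point is the size of the gap.
years_awake   = 30
hours_per_day = 16
seconds_awake = years_awake * 365 * hours_per_day * 3600   # ~6.3e8 s

# Pretend vision delivers only ~1 MB/s of useful signal after the retina
# compresses things (wildly uncertain, and far below the 500-megapixel figure above).
visual_bytes  = seconds_awake * 1_000_000                   # ~6.3e14 bytes

# Reading nonstop at ~250 words/min, ~5 bytes/word, for the same waking hours.
reading_bytes = (seconds_awake / 60) * 250 * 5              # ~1.3e10 bytes

print(f"visual  ~{visual_bytes:.1e} bytes")
print(f"reading ~{reading_bytes:.1e} bytes")
print(f"ratio   ~{visual_bytes / reading_bytes:,.0f}x")     # ~48,000x even lowballed
```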
> What does follow for me is closer to what Steven Pinker has been proposing in his own critiques of LLMs and AGI, which is that there is no necessary correlation between goal seeking (or morality) and intelligence. I also feel this is concordant with Wittgenstein's own work.
I haven't read Steven Pinker's critiques (could you link them, please?) so I can't say much about that. What does he mean by goal seeking?
IMO the only goal that matters is the will to survive, but let's assume for the sake of this discussion that it's not necessary for intelligence (otherwise we'll have to force our AGI bots to fight in a thunderdome, and that's probably how we get Battlestar Galactica all over again).
> My suspicion is that people don't want this scaling up to work because it would force them to let go of metaphysical commitments they have on both the nature of intelligence as well as the nature of reality. And for this reason they are adamantly disbelieving in even the possibility before the evidence has been gathered.
Let me be clear: I have zero metaphysical commitments and I can't wait until we come up with a richer vocabulary to describe intelligence. LLMs are clearly "intelligent", just not in any human sense quite yet. They don't have the will to survive, or any sense of agency, or even any permanence beyond the hard drive they exist on, but damn if they're not intelligent in some way. We just need better words to describe the levels of intelligence than "human, dog/cat/pig/pet, and everyone else"
However, I have some very strong physical commitments that must be met before I can even consider any algorithm as intelligent:
Neuroplasticity: human brains are incredibly capable of adapting all throughout life. That ranges from simple drug tolerance to neurotransmitter attenuation/potentiation to the growth of new ion channels on cell membranes to very complex rewiring of axons and dendrites. That change is constant. It never stops, and LLMs don't have anything remotely like it. The brain rearchitects itself constantly, and it's controlled as much by higher-order processes as by the neuron itself.
Scale: last time I did the math the minimum number of parameters required to represent the human connectome was over 500 quadrillion. 10+ quintillion is probably more accurate. That's 6-8 orders of magnitude more than we have in SOTA LLMs running on the best of the best hardware and Moore's law isn't going to take us that far. A 2.5D CPU/GPU might not even be theoretically capable of enough elements to simulate a fraction of that.
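The arithmetic behind that "6-8 orders of magnitude" is just logarithms; plugging in the numbers above (the SOTA parameter count is my assumption):

```python
import math

connectome_params_low  = 5e17   # "over 500 quadrillion"
connectome_params_high = 1e19   # "10+ quintillion"
sota_llm_params        = 2e11   # a few hundred billion; assumed, roughly GPT-3/4-class

lo = math.log10(connectome_params_low  / sota_llm_params)
hi = math.log10(connectome_params_high / sota_llm_params)
print(f"gap: ~{lo:.1f} to ~{hi:.1f} orders of magnitude")   # roughly 6 to 8
```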
Quantization: I'm not sure neurons can be fully simulated with the limited precision of FP64, let alone FP32/16 or Q8/7/6/5/4. I've got far less evidence for this point than the others but it's a deeply held suspicion.
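And a toy illustration of the quantization worry (nothing to do with real neuron models, just float behavior): accumulate a tiny "synaptic" update many times at different precisions and watch half precision silently throw it away.

```python
import numpy as np

tiny_update = 1e-4   # a small "synaptic" change
steps = 100_000

acc64 = np.float64(0.0)
acc16 = np.float16(0.0)
for _ in range(steps):
    acc64 = np.float64(acc64 + tiny_update)
    acc16 = np.float16(acc16 + np.float16(tiny_update))

print(acc64)  # ~10.0, as expected
print(acc16)  # stalls near 0.25: once the sum is that big, a 1e-4 increment is
              # below half a float16 ulp and every addition rounds away to nothing
```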
> I haven't read Steven Pinker's critiques (could you link them, please?) so I can't say much about that. What does he mean by goal seeking?
He has spoken about it multiple times on a few different podcasts. Here is a recent discussion he had with physicist David Deutsch [1] where he references this idea (See chapter timestamp for "Does AGI need agency to be 'creative'?").