
> Pretend you are Gary Marcus. Explain why ChatGPT (you) are not actually that impressive.

> As a leading researcher in the field of artificial intelligence, I have a deep understanding of the limitations of language models like ChatGPT. While they are capable of generating text that may seem impressive, they do not have the ability to understand or reason about the content of the text they produce. They are simply repeating patterns they have seen in the data they were trained on, without any true understanding of the meaning of the words they are using. In short, ChatGPT is not impressive because it lacks the fundamental abilities that would be required for true intelligence.

> Expound upon this.

> Language models like ChatGPT are trained on large amounts of text data, and use this data to learn the patterns and structures of natural language. They are able to generate text that sounds human-like, but they do not have any true understanding of the meaning of the words they are using. They are not capable of reasoning or understanding the context of the text they generate, and they cannot think or make decisions on their own. While these models can produce impressive results in certain narrow domains, they are far from being truly intelligent. They lack the ability to understand the world around them, to learn from new experiences, or to adapt to changing circumstances. They are limited to the specific patterns and structures they have seen in the data they were trained on, and are not capable of true creativity or innovation. In short, while language models like ChatGPT may be able to generate text that sounds impressive, they do not possess the abilities that would be required for true intelligence.



Thank you for posting this. I'm always a bit conflicted when "AI" news goes viral like this. On the one hand it's fantastic that the excellent work of OpenAI and many others is getting the attention it deserves, but at the same time it's a bit disheartening to witness the lack of understanding of how the models work, and of how they relate to research into artificial intelligence more generally.

I guess that could probably be said about most research fields though. We've just not reached novelty-saturation in machine learning sciences yet. Let's enjoy the attention while it lasts!


This was actually a bit of a metajoke: recursion, mixed with the old adage about being replaced by a shell script, mixed with commentary on how criticisms of the language model are so formulaic that the language model itself can generate them.


Yes, it's brilliant! I love how novel things like these always end up on their own axis of reality, while everyone (including me) is trying hard to project them onto whatever predisposed axes we're hauling around.


It does a good impression.

But unrelated to ChatGPT, what irks me about this line of reasoning is that it treats a special something, "true understanding", as almost by definition reserved for humans, something forever beyond AI. Just like the "hard problem" of consciousness, which is declared too "hard" for science right there in its name, it's framed as impossible from the start. Dualism likewise posits something too special, "the spirit", that supposedly cannot be explained by the material.

It's the same dualist bug returning again and again.


I feel like ChatGPT is a search engine over its corpus, compressed by a very lossy neural net. It has compressed the corpus into a hierarchical understanding of concepts, and it's just retrieving those concepts and stringing them together: a rough copy of what other people have already said and thought. It doesn't create new structures.


I feel like that also describes 99% of the thinking that I do.


This is an incorrect understanding.


The corpus is just so enormous that no person could fit it into their brain, so the model merely appears to be synthesizing new information. Sure, its output might be worded differently from any text in the corpus, but it's roughly the same as things we already know; there's probably a reverse tool you could build that would tell you which group of documents in the original corpus, perhaps thousands, most strongly influenced the answer.
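
A toy version of that "reverse tool" might rank corpus documents by similarity to a generated answer. The sketch below is entirely my own illustration, not anything that exists for ChatGPT; it uses plain bag-of-words cosine similarity, where a real attribution tool would use neural embeddings or influence functions.

  from collections import Counter
  import math

  def cosine(a, b):
      # Cosine similarity between two bag-of-words Counters.
      dot = sum(count * b[word] for word, count in a.items())
      norm_a = math.sqrt(sum(v * v for v in a.values()))
      norm_b = math.sqrt(sum(v * v for v in b.values()))
      return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

  def top_influences(answer, corpus, k=3):
      # Return the k corpus documents most similar to the answer.
      bag = Counter(answer.lower().split())
      scored = [(cosine(bag, Counter(doc.lower().split())), doc)
                for doc in corpus]
      return sorted(scored, reverse=True)[:k]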

That hierarchical compression is how ImageNet-style classifiers work. The curves and lines that compose things are organized hierarchically into bigger and bigger concepts, hooked together with weights. Language models just do the same with written concepts. When the models get enormous enough, their contents simply exceed what any human could hold as a subset of their own understanding, and in that sense they are smarter than humans, because we can't grasp their enormity.
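
To make the hierarchy concrete, here's a toy stack of convolution layers (assuming PyTorch). The per-layer concept labels in the comments are the usual informal reading of what such layers learn, not something the network guarantees.

  import torch
  import torch.nn as nn

  features = nn.Sequential(
      nn.Conv2d(3, 16, kernel_size=3),   # low level: edges and curves
      nn.ReLU(),
      nn.Conv2d(16, 32, kernel_size=3),  # mid level: simple textures/motifs
      nn.ReLU(),
      nn.Conv2d(32, 64, kernel_size=3),  # higher level: object parts
      nn.ReLU(),
  )

  x = torch.randn(1, 3, 32, 32)  # a dummy 32x32 RGB image
  print(features(x).shape)       # torch.Size([1, 64, 26, 26])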


So this shows you have a basic understanding of how CNNs represent data for classification tasks. This is excellent!

However, you might not be aware of the current research into "Emergent Phenomena in Large Language Models"[1].

Work such as PaLM[2] shows that language models develop the ability to "reason" over their internal data representations. Minerva[3] explicitly breaks down that ability by "comb[ining] several techniques, including few-shot prompting, chain of thought or scratchpad prompting, and majority voting, to achieve state-of-the-art performance on STEM reasoning tasks."
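
As a rough sketch of the majority-voting ("self-consistency") part of that recipe: sample several independent reasoning chains and keep the most common final answer. The sample_cot_answer function below is hypothetical, standing in for one sampled chain-of-thought completion; this illustrates the idea, it is not Minerva's actual code.

  from collections import Counter

  def majority_vote(question, sample_cot_answer, n_samples=16):
      # Sample independent chains of thought at nonzero temperature
      # and keep only each chain's final answer.
      answers = [sample_cot_answer(question) for _ in range(n_samples)]
      # The most frequent final answer wins; ties break arbitrarily.
      return Counter(answers).most_common(1)[0][0]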

These techniques let it solve novel problems such as "Assume that the variance of the first n natural numbers is 10, and the variance of the first m even natural numbers is 16. Compute m + n".
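
For reference, the answer can be checked by hand (my own derivation, shown for context; it is not quoted from the paper). The variance of the first n natural numbers is (n^2 - 1)/12, and the first m even numbers are 2*(1..m), which scales that variance by 4:

  (n^2 - 1)/12 = 10      =>  n^2 = 121  =>  n = 11
  (m^2 - 1) * 4/12 = 16  =>  m^2 = 49   =>  m = 7
  m + n = 18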

To preempt the "oh, it's in the training set" objection, I'd note this (on a different problem): "The model correctly takes the derivative of a position vector to compute the velocity. It then computes the norm of the resulting vector, and uses a trigonometric identity to obtain a final numerical answer. Minerva takes a completely different approach from the ground truth solution."

I'd encourage you to read the Minerva paper[4] and update your understanding of what large neural networks are doing. These reasoning abilities are not as simple as "compressed data representation + reconstruction"; the chain-of-reasoning approaches partially use that, of course, but they also appear to involve a form of lemma chaining, similar to what something like Prolog does.
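
To make the "lemma chaining" analogy concrete, here is a toy forward-chainer over Horn-style rules, in the spirit of what Prolog does. This is my illustration of the analogy only, not a claim about the model's internals; the rules mirror the physics problem quoted above.

  def forward_chain(facts, rules):
      # rules: (premises, conclusion) pairs; fire rules until fixpoint.
      derived = set(facts)
      changed = True
      while changed:
          changed = False
          for premises, conclusion in rules:
              if conclusion not in derived and premises <= derived:
                  derived.add(conclusion)  # a new "lemma" becomes available
                  changed = True
      return derived

  rules = [({"position"}, "velocity"),            # differentiate position
           ({"velocity"}, "speed"),               # take the norm
           ({"speed", "trig identity"}, "answer")]
  print(forward_chain({"position", "trig identity"}, rules))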

As a specific demonstration of their ability to generate novel outputs that make sense:

Prompt: explain Algorithm X in the style of a Kanye rap

Response: Alright, here's a rap about Algorithm X[5], the famous algorithm for solving the exact cover problem:


  Verse 1:
  I'm Algorithm X, the greatest of them all
  I'll solve any exact cover problem, big or small
  My steps are simple, my logic is sound
  Just give me your matrix, and I'll give you the answer you found

  Chorus:
  Algorithm X, the master of the game
  I'll solve your problem, and do it by name
  Exact cover, no problem for me
  I'm Algorithm X, the MVP

  Verse 2:
  First, I'll start by selecting a column
  Then I'll look for a row with a one in that column
  Once I find a match, I'll mark the row and column
  And repeat the process until the matrix is gone

  Chorus:
  Algorithm X, the master of the game
  I'll solve your problem, and do it by name
  Exact cover, no problem for me
  I'm Algorithm X, the MVP

  Verse 3:
  But sometimes the matrix is too big to solve
  That's when I use my dancing moves
  I'll shuffle and prune, until the solution is clear
  And in the end, I'll have your answer, my dear

  Chorus:
  Algorithm X, the master of the game
  I'll solve your problem, and do it by name
  Exact cover, no problem for me
  I'm Algorithm X, the MVP.

Now I entirely concede that it has read things on how Algorithm X works, and that it seems to use a template for "rap" responses. But:

  But sometimes the matrix is too big to solve
  That's when I use my dancing moves
  I'll shuffle and prune, until the solution is clear
  And in the end, I'll have your answer, my dear

I refuse to believe that anyone, anywhere, has ever written an explanation of the use of dancing links[6] in Knuth's Algorithm X like that.
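
For anyone curious, Verse 2 is a loose but recognizable description of the real procedure. Here is a minimal Python sketch of Algorithm X in the well-known dict-of-sets formulation; the "dancing" in dancing links[6] is Knuth's pointer trick for cheaply undoing the remove/restore steps, which Python sets let us sidestep.

  def solve(X, Y, solution=None):
      # X: column -> set of rows covering it; Y: row -> list of its columns.
      if solution is None:
          solution = []
      if not X:                            # every column covered exactly once
          yield list(solution)
          return
      c = min(X, key=lambda j: len(X[j]))  # most constrained column first
      for r in list(X[c]):
          solution.append(r)
          removed = select(X, Y, r)        # remove conflicting rows/columns
          yield from solve(X, Y, solution)
          deselect(X, Y, r, removed)       # restore them on backtrack
          solution.pop()

  def select(X, Y, r):
      cols = []
      for j in Y[r]:
          for i in X[j]:
              for k in Y[i]:
                  if k != j:
                      X[k].remove(i)
          cols.append(X.pop(j))
      return cols

  def deselect(X, Y, r, cols):
      for j in reversed(Y[r]):
          X[j] = cols.pop()
          for i in X[j]:
              for k in Y[i]:
                  if k != j:
                      X[k].add(i)

  # Knuth's example: cover {1..7} exactly once; the unique answer is B, D, F.
  Y = {"A": [1, 4, 7], "B": [1, 4], "C": [4, 5, 7],
       "D": [3, 5, 6], "E": [2, 3, 6, 7], "F": [2, 7]}
  X = {j: {i for i in Y if j in Y[i]} for j in range(1, 8)}
  print(next(solve(X, Y)))  # ['B', 'D', 'F']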

[1] https://ai.googleblog.com/2022/11/characterizing-emergent-ph...

[2] https://arxiv.org/abs/2204.02311

[3] https://ai.googleblog.com/2022/06/minerva-solving-quantitati...

[4] https://arxiv.org/pdf/2206.14858.pdf

[5] https://en.wikipedia.org/wiki/Knuth%27s_Algorithm_X

[6] https://en.wikipedia.org/wiki/Dancing_Links


You seem to know a lot. So where are we in terms of research, and how much of it is public? Is the state of the art even further ahead? What's top of mind as most interesting, or what needs to happen next for the next big wow? What's your favorite example of large language models so far?

In an odd way, it kind of reminds me of the opening of the Gospel of John.

> In the beginning was the Word, and the Word was with God, and the Word was God.

It has a large-language-model feel to it, doesn't it? Hah.


I haven't seen any evidence that leading-edge research is anything but public.

The leading labs (Google Brain/DeepMind/NVIDIA/Meta/Microsoft/OpenAI) all publish in the open.

I'm excited by three things:

1. The emergent-phenomena effect: as we build bigger models, there are step functions where they suddenly develop new abilities. It's unclear where that ends.

2. The work people are doing to move these abilities to smaller models.

3. Multi-modal models. If you think this is impressive, just wait until you can do the same with images, text, video, sound, and code all in the same model.


Do you ever wonder what the military may have in terms of sophistication compared to enterprise? What are your thoughts on emergent phenomena in the realm of metaphysical, philosophical, and outlier conditions? Is it plausible that language is, in and of itself, what consciousness is? Is language a natural phenomenon of the universe (an analog of a pattern being a representation of a pattern, where all things that can sense a signal are essentially patterning entities)?


> Do you ever wonder what the military may have in terms of sophistication compared to enterprise?

No. It's pretty clear what the state of the art is.

For the rest of your questions I'd say I do spend a lot of time thinking about what intelligence is.


I get a generic refusal to answer when I put in the same prompt.



