>This is a pretty simple question to answer. Take two lists and compare them.
This continues a pattern as old as home computing: The author does not understand the task themselves, consequently "holds the computer wrong", and then blames the machine.
No "lists" were being compared. The LLM does not have a "list of TLDs" in its memory that it just refers to when you ask it. If you haven't grokked this very fundamental thing about how these LLMs work, then the problem is really, distinctly, on your end.
That’s the point the author is making: the LLMs don’t have the raw correct information required to accomplish the task, so all they can do is produce a plausible-sounding answer. And even if they did, the way they are architected can still only result in a plausible-sounding answer.
They absolutely could have accomplished the task. The task was, purposefully or ignorantly, posed in a way that is known to be unsuited to an LLM, and then the author concluded "the machine did not complete the task because it sucks."
Not really. This works great in Claude Sonnet 4.1: 'Please could you research a list of valid TLDs and a list of valid HTML5 elements, then cross reference them to produce a list of HTML5 elements which are also valid TLDs. Use search to find URLs to the lists, then use the analysis tool to write a script that downloads the lists, normalises and intersects them.'
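The "download, normalise, intersect" script that prompt asks the model to write could look roughly like this sketch. The IANA URL below is the real published TLD list; there is no single canonical plain-text list of HTML5 elements, so a small hardcoded sample stands in for it here, and the TLD sample is likewise illustrative rather than the full live list:

```python
# Sketch of the "download, normalise, intersect" approach described above.
# A real run would fetch both full lists (TLDs from IANA, elements from
# e.g. the WHATWG spec index); samples are hardcoded here for brevity.

TLD_URL = "https://data.iana.org/TLD/tlds-alpha-by-domain.txt"  # real IANA list

# Illustrative sample of HTML5 element names (not exhaustive).
HTML5_ELEMENTS = ["a", "audio", "div", "link", "menu", "nav",
                  "select", "span", "style", "video"]

def normalise(names):
    """Lowercase and strip whitespace so both lists compare cleanly."""
    return {n.strip().lower() for n in names if n.strip() and not n.startswith("#")}

def intersect(tlds, elements):
    """Return the sorted set of element names that are also TLDs."""
    return sorted(normalise(tlds) & normalise(elements))

# Offline demo with a handful of TLD strings (the IANA file, which starts
# with a '#' comment line, currently holds on the order of 1400 entries):
sample_tlds = ["# header", "COM", "ORG", "AUDIO", "LINK", "MENU",
               "NAV", "SELECT", "STYLE", "VIDEO"]
print(intersect(sample_tlds, HTML5_ELEMENTS))
# → ['audio', 'link', 'menu', 'nav', 'select', 'style', 'video']
```

Swapping the hardcoded samples for `urllib.request.urlopen(TLD_URL)` plus a scrape of the element index is the part the prompt delegates to the model's analysis tool.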
> This works great in Claude Sonnet 4.1: 'Please could you research a list of valid TLDs and a list of valid HTML5 elements, then cross reference them to produce a list of HTML5 elements which are also valid TLDs. Use search to find URLs to the lists, then use the analysis tool to write a script that downloads the lists, normalises and intersects them.'
Ok, I only have to:
1. Generally solve the problem for the AI
2. Make a step by step plan for the AI to execute
3. Debug the script I get back and check by hand if it uses reliable sources.
Try doing all of that by hand instead: the difference is about half an hour to an hour of work, plus having to give your attention to such a minor, menial task.
Also, you are literally describing how you are holding it wrong. If you expect the LLM to magically know what you want from it without you yourself having to make the task understandable to the machine, you are standing in front of your dishwasher waiting for it to grow arms and do your dishes in the sink.
>Hand feed them every detail for an extremely simple task like comparing two lists
You believe 57 words are "each and every detail", and that "produce two full, exhaustive lists of items out of your blackbox inner conceptspace/fetch those from the web" are "extremely simple tasks"?
Your ignorance of how complex these problems are misleads you into believing there's nothing to them. You are trying to supply an abstraction to a system that requires a concrete specification. You do not even realize your abstraction is an abstraction. Try learning programming.
> You believe 57 words are "each and every detail", and that "produce two full, exhaustive lists of items out of your blackbox inner conceptspace/fetch those from the web" are "extremely simple tasks"?
Sure they are. I'm not interested in how difficult this is for an LLM. That is not the question. Go out there, get the information. That this is hard for an LLM proves the point: they are surprisingly bad at some simple tasks.
>I'm not interested in how difficult this is for a LLM. This is not the question.
That wasn't my point either. It is a complex problem, full stop. Again, your own inability to look past your personal abstractions ("just do the thing, it's literally one step dude") is what makes it feel simple. Did you ever do that "instruct someone to make coffee" exercise when you started out? What you're doing is saying "just make the coffee", refusing to decompose the problem any further, and then complaining that the other person is bad at following instructions.
How would you solve that problem? You'd probably go to the internet, get the list of TLDs and the list of HTML5 elements, and then compare those lists.
The author compares three commercial large‑language models that have direct internet access, but none of them appear capable of performing this seemingly simple task. I think his conclusion is valid.