Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

>LLMs can't be strategic because they do not understand the big picture

While I do tend to believe you, what evidence based data do you have to prove this is true?



> While I do tend to believe you, what evidence based data do you have to prove this is true?

IMO the onus is to prove that they can be strategic. Otherwise you're asking me to prove a negative.


Task Time horizons are improving exponentially with doubling times around 4 months per METR. At what timescale would you accept that they "can be strategic"? Theres little reason to think they wont be at multi week or month time horizons very soon. Do you need to be strategic to complete multi month tasks?


Can an LLM give you an upfront estimate that a task will take multiple months?

Can it decide intelligently what it would have to change if you said "do what you can to have it ready in half the time?"


>Can an LLM give you an upfront estimate that a task will take multiple months?

>Can it decide intelligently what it would have to change if you said "do what you can to have it ready in half the time?"

Do you think ChatGPT 5.2 Pro can't estimate how long a task might take? Do you think that estimate would necessarily be worse than the estimates, which are notoriously poor, coming from human engineers?

But you can still answer my question. When an LLM can complete a task that takes a person N months or years, is it capable of being strategic?


Multiple people have already answered your question in this thread. An LLM can’t be strategic because that’s not a capability of the technology itself


Saying the tiger has to prove it can eat you is not a great strategy to survive a tiger attack.


Well so far the tiger faceplants in an embarrassing fashion every time it tries to eat someone. So I'm not really worried about that.


Gary Marcus: "LLMs will never be able to a..... wait, what do you mean they can already do that?"


This is fundamental to the way LLMs work. If you don’t know this, you should be reading up on that instead of asking for evidence based data on things that are fundamental to the technology we’re discussing


Prompt-injection in all its forms. If the hyper-mad-libs machine doesn't reliably "understand" and model the difference between internal and external words, how can we trust them to model fancier stuff?


We can't even trust LLMs to get basic logic right, or even the language syntax sometimes. They reliably generate code worse than a human would write, and have zero reasoning ability. Anyone who thinks they can model something complicated is either uncritically absorbing hype or has a financial stake in convincing people of the hype.


That’s no problem. Just tell it to make sure the code it generates has no security vulnerabilities! That’ll take care of the issue




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: