
After working with agent-LLMs for some years now, I can confirm that they are completely useless for real programming.

They have never helped me solve complex problems with low-level libraries. They cannot find nontrivial bugs. They don't get the logic of interwoven layers of abstractions.

LLMs pretend to do this with big confidence and fail miserably.

For every problem I have to switch my brain to ON mode and wake up; the LLM never wakes up.

It surprised me how well it solved another task: I told it to set up a website with some SQL database and scripts behind it. When you click here, show some filtered list there. Worked like a charm. A very solved problem and very simple logic, done a zillion times before. But this saved me a day of writing boilerplate.
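That "click here, show some filtered list there" task really is solved-a-zillion-times territory; as a rough sketch (hypothetical table and column names, using only Python's stdlib sqlite3), the server-side piece boils down to a parameterized SELECT:

```python
import sqlite3

# In-memory database standing in for the site's SQL backend
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO items (name, category) VALUES (?, ?)",
    [("widget", "tools"), ("gadget", "tools"), ("novel", "books")],
)

def filtered_list(category):
    """What the 'click here, show filtered list there' handler boils down to:
    a parameterized query, returned for rendering."""
    rows = conn.execute(
        "SELECT name FROM items WHERE category = ? ORDER BY name", (category,)
    ).fetchall()
    return [name for (name,) in rows]

print(filtered_list("tools"))  # → ['gadget', 'widget']
```

Nothing here requires interwoven layers of abstraction, which is exactly why it's in the LLM sweet spot.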

I agree that there is no indication that LLMs will ever cross the border from simple-boilerplate-land to understanding-complex-problems-land.





    I can confirm that they are completely useless for real programming
And I can confirm, with similar years of experience, that they are not useless.

Absolutely incredible tools that have saved hours and hours helping me understand large codebases, brainstorm features, and point out gaps in my implementation or understanding.

I think the main disconnect in the discourse is that there are those pretending they can reliably just write all the software, when anyone using them regularly can clearly see they cannot.

But that doesn't mean they aren't extremely valuable tools in an engineer's arsenal.


Same. I started coding before hitting puberty, and I'm well into my 30s.

If you know the problem space well, you can let LLMs (I use Claude and ChatGPT) flesh it out.


> I use Claude and ChatGPT

Both for code? For me, it's Claude only for code. ChatGPT is for general questions.


Yes, I use them in tandem. Generally Claude for coding and ChatGPT when I run out of tokens in Claude.

I also use ChatGPT to summarise my project. I ask it to generate Markdown and PDFs explaining the core functionality.


I feel like I have to be strategic with my use of Claude Code: frequently clearing out sessions to minimize context, writing the plan out to a file so I can review and even edit it myself, breaking problems down into consumable chunks, attacking those chunks in separate sessions, etc. It's a lot of prep work to make the tool thrive. That doesn't mean it's useless, though.

"real programming"

Perhaps you're doing some amazing low-level work, but it feels like you're way overestimating how much of our industry does that. A massive amount of developers show up to work every day and just stitch together frameworks and libraries.

In many ways, it feels similar to EVs. Just because EVs aren't yet, and may never be, effective for moving massive amounts of cargo in a day with minimal refueling doesn't mean they aren't an effective solution for the bulk of drivers, who have an average commute of 40 miles a day.


> After working with agent-LLMs for some years now, I can confirm that they are completely useless for real programming

This is a bit of a no-true-Scotsman, no? For you, "real programming" is "stuff LLMs are bad at," but a lot of us out in the real world are able to effectively extract code that meets the requirements of our day jobs by tossing natural-language descriptions into LLMs.

I actually find the rise of LLM coding depressing and morally problematic (re copyright / ownership / license laundering), and on a personal level I feel a lot of nostalgia for the old ways, but I simply can't levy an "it's useless" argument against this stuff with any seriousness.


I only use it sparingly thus far, and for small things, but I don't find it depressing at all - but timely.

All those many, many languages, frameworks, libraries, APIs, and their many, many iterations: so much time lost on minute details. The natural-language description, even highly detailed down to being directly algorithmic, is a much better level for me. I have gotten more and more tired of coding, but maybe part of it is too much JavaScript and its quickly changing environment and tools, for too many years (not any more, though). I have felt for quite some time that I'm wasting way too much time chasing all those many, many details.

I'm not pro-high-level-programming per se - I started a long time ago with 8 bit assembler and knowing every one of the special registers and RAM cells. I cherish the memories of complex software fitting on a 1.44 MB floppy. But it had gotten just a bit too extreme with all the little things I had to pay attention to that did not contribute to solving the actual (business) problem.

I feel it's a bit early even if it's already usable, but I hope they can get at least one more giant leap out of AI in the next decade or so. I am quite happy to be able to concentrate on the actual task, instead of the programming environment minutiae, which has exploded in size and complexity across platforms.


"they are completely useless for real programming"

You and I must have completely different definitions of "real programming". In this very comment, you described a problem that the model solved. The solution may not have involved low-level programming, or discovering a tricky bug entrenched in years-worth of legacy code, but still a legitimate task that you, as a programmer, would've needed to solve otherwise. How is that not "real programming"?


I wouldn't describe the LLM's actions in the example as "solving a problem" so much as "following a well-established routine". If I were to, for instance, make a PB&J sandwich, I wouldn't say that what I'm doing is "real cooking" even if it might technically fit the definition.

If an LLM controlling a pair of robot hands was able to make a passable PB&J sandwich on my behalf, I _guess_ that could be useful to me (how much time am I really saving? is it worth the cost? etc.), but that's very different from those same robo-hands filling the role of a chef de cuisine at a fine dining restaurant, or even a cook at a diner.


In this analogy you're clearly a private chef with clients who have very specific wishes and allergies.

The rest of us are just pumping out CRUD-burgers off the API assembly line. Not exactly groundbreaking stuff.

LLMs are really good with burgers, but not so much being a private chef.


Every useful CRUD app becomes its own special snowflake with time and users.

Now, if your CRUD app never gets any users, sure, it stays generic. But we've had low-code solutions that solve this problem for decades.

LLMs are good at stuff that probably should have been low-code in the first place but couldn't be, for reasons. That's useful, but it comes with a ton of trade-offs. And these kinds of solutions cover a lot less ground than you'd think.


I'm old enough to remember the "OMG low-code is going to take our jeeeerbbs!" panic :D

Like LLMs, they took away a _very_ specific segment of software. Zapier, n8n, NodeRED, etc. do some things in a way that bespoke apps can't, but they also hit a massive hard wall where you either need to do some really janky shit or just break out Actual Code to move forward.


"real programming" hits a "true scottsman" snare with me.

People are saying Codex 5.2 full-solved the crypto challenges in the 39C3 CTF last weekend.

Three months ago I would have agreed with you, but anecdotal evidence says Codex 5.2 and Opus 4.5 are finally there.


You'll get a vastly different experience the more you use these tools and learn their limitations and how you can structure things effectively to let them do their job better. But lots of people, understandably, don't take the time to actually sit down and learn it. They spend 30 seconds on some prompt not even a human would understand, and expect the tooling to automatically spend 5 hours trying its hardest at implementing it, then they look at the results and conclude "How could anyone ever be productive with this?!".

People say a lot of things, and there is a lot of context behind what they're saying that is missing, so then we end up with conversations that basically boil down to one person arguing "I don't understand how anyone cannot see the value in this" with another person thinking "I don't understand how anyone can get any sort of value out of this", both missing the other's perspective.


Prompt engineering is just good transfer notes and ticket writing, which is something the majority of the devs I've worked with don't enjoy or excel at.

I've been using Codex and Claude Sonnet for many months now, Codex for personal projects and Sonnet for work, and I agree. Three months ago these tools were highly usable; now, with Codex 5.2 and Sonnet 4.5, I think we're at the point where you can confidently rely on them to analyze your repo's codebase and solve, at the very least, small scoped problems and apply any required refactor back throughout the codebase.

6-12+ months ago the results I was getting with these tools were highly questionable, but in the last six months the changes have been pretty astounding.


Sonnet is dumb as a bag of bricks compared to Opus; perhaps you meant Opus? I never use Sonnet for anything anymore. It's either too verbose or just can't handle tasks that Opus one-shots.

I use the Copilot extension in VS Code, which links back to my enterprise GitHub account, where I have Claude Sonnet 4.5 available amongst other things. I'm not familiar with Opus. I just open the Copilot Chat window in my VS Code, configure it to use Sonnet 4.5, tell it what I need and it writes the responses and code for me. I'm not using it for large tasks. Most of my usage is "examine this codebase and tell me how to fix xyz problem" or "look at this source code file and show me the code to implement some feature, make sure to examine the entire codebase for insight into how it should be integrated with the rest of the project"

There are other, more advanced AI coding tools, but this has accomplished almost all of my needs so far.


The Copilot extension in VS Code includes Opus as well. It costs three times as much as Claude, so I'd expect it to perform better or be able to handle more complex tasks, but if you're happy with Claude - I am too - more power to you.

s/Claude/Sonnet/

These anecdotes feel so worthless. I notice almost no difference between the two and get generally high quality results from either. This is also a worthless anecdote. I'm guessing what kind of codebase you are working in matters a lot as well as the tasks you're giving it.

> After working with agent-LLMs for some years now

Some years? I don't remember any agents being any good at all before just over a year ago with Cursor, and stuff really didn't take off until Claude Code.

Which isn't to say you weren't working with agent-LLMs before that, but I just don't know how relevant anything but recent experience is.


> I can confirm that they are completely useless for real programming

Can you elaborate on "real programming" ?

I am assuming you mean the simplest hard problem that is solved. The value of the work is measured in those terms. Easy problems have boilerplate solutions and have been solved numerous times in the past. LLMs excel here.

Hard problems require intricate woven layers of logic and abstraction, and LLMs still struggle since they do not have causal models. The value however is in the solution of these kinds of problems since the easy problems are assumed to be solved already.


> After working with agent-LLMs for some years now, I can confirm that they are completely useless for real programming.

> They never helped me solve complex problems with low-level libraries. They can not find nontrivial bugs. They don't get the logic of interwoven layers of abstractions.

This was how I felt until about 18 months ago.

Can you give a single, precise example where modern day LLMs fail as woefully as you describe?


I had to disable baby Ceph (DeepSeek 3.1) from writing changes in Continue because he's like a toddler. But he did confirm some solutions, wrote a routine, turned me on to some libraries, etc.

So I see what you're saying. He comes up with the wrong answers a lot on problems involving a group of classes across related files.

However, it's Continue, so it can read files in VS Code, which is really nice and helps a lot with its comprehension, so sometimes it does find the issue, or at least the nature of the issue.

I tend to give it bug n-1 to pre-digest while I work on bug n.


Claude is currently porting my Rust emulator to WASM. It's not easy at all; it struggles, and I need to guide it quite a lot, but it's way easier to let it do the work than to learn yet another tech myself. For the same result I have 50% of the mental load...

It’s crazy how different my experience is. I think perhaps it’s incredibly important what programming language you are using, what your project and architecture is like. Agents are making an extraordinary contribution to my productivity. If they jacked my Claude Code subscription up to $500/month I would be upset but almost certainly would keep paying it, that’s how much value it brings.

I’m in enterprise ERP.


It sounds like you use your personal Claude Code subscription for work of your employer, but that is not something I would ever consider doing personally so I imagine I must be mistaken.

Can you elaborate slightly on what you pay for personally and what your employer pays for with regards to using LLMs for Enterprise ERP?


Freelancers regularly use tools such as Copilot and Claude; it's always handled professionally and in agreement with their customers. I've seen other freelancers do it plenty of times in the last 1-2 years at my customers' sites.

Why so narrow-minded?


I'm inquisitive, not narrow-minded.

The GP didn't mention anything about freelancing, so unless you know them or stalked them, you are perhaps being narrow-minded here.


Even more important than those things is how well you can write and communicate your ideas. If you cannot communicate an idea so that a human could implement it as you intended without asking extra questions, an LLM isn't going to be able to either.

As someone who has managed engineers for many years I find those skills immediately applicable to the LLM domain. If you aren't used to communicating what you are trying to build to other engineers I think using the AI is harder as you need to develop those skills.

I'd take it a step further and say that any engineer who is used to collaborating with others, engineers or not, should have these skills already. But as most of us know, communication is a generally lacking skill among the population at large, even among engineers.

Natural-language programming has arrived, in my opinion. If you're not a developer and don't have any programming experience, though, it won't help much.

> After working with agent-LLMs for some years now, I can confirm that they are completely useless for real programming.

"completely useless" and "real programming" are load bearing here. Without a definition to agree on for those terms, it's really hard not to read that as you're trying to troll us by making a controversial unprovable claim that you know will get people that disagree with you riled up. What's especially fun is that you then get to sneer at the abilities of anybody making concrete claims by saying "that's not real programming".

How tiresome.


Who cares about semantics.

Ultimately it all boils down to the money - show me the money. OAI have to show money and so do its customers from using this tool.

But nope, the only thing out there where it matters is hype. Nobody on an earnings call is clearly showing how they had a numerical jump in operating efficiency.

Until I see that, this technology has a dated shelf life and only those who already generate immense cash flows will fund its continued existence given the unfavourable economics of continued reinvestment where competition is never-ending.


The "real programming" people are moving the goalposts of their no true scotsman fallacy so fast they're leaving Roadrunner style dust behind them.

Yes, there are things LLMs can't do at all, some where they are actively dangerous.

But also there are decently sized parts of "software development" where any above-average LLM can speed up the process, as long as whoever is using it knows how to do so and doesn't fight the tool.


Who cares. Focus on what matters. OAI knows this considering they are dedicating a lot of their resources toward figuring out how to become profitable.

Isn't OAI only unprofitable because they are putting all their money into more training?

The product-market fit for LLMs has already clearly been found; there's just no moat to it. Tokens are a commodity.


Maybe they'll cut off the free tiers in 2026, and the only thing left will be China and OpenRouter.

Agreed. We should instead be sneering at the AI critics, because "you're holding it wrong".

> After working with agent-LLMs for some years now, I can confirm that they are completely useless for real programming.

> They never helped me solve complex problems with low-level libraries. They can not find nontrivial bugs. They don't get the logic of interwoven layers of abstractions.

> LLMs pretend to do this with big confidence and fail miserably.

This is true for most developers as well. The mean software developer, especially if you outsource, has failure modes worse than any LLM and round-trip time is not seconds but days.

The promise of LLMs is not that they solve the single most difficult tasks for you instantly, but that they do the easy stuff well enough that they replace offshore teams.


> The promise of LLMs is not that they solve the single most difficult tasks for you instantly, but that they do the easy stuff well enough that they replace offshore teams.

But that's exactly the *promise* of LLMs by the hypepeople behind it.


I bet you trusted the Blockchain bros and were first in line to buy NFTs too. No?

Why would you trust the hype when you can verify this stuff yourself pretty easily.


Obviously I am calling that promise bullshit...

>But that's exactly the promise of LLMs by the hypepeople behind it.

I do not know and do not care what the "hypepeople" say. I can tell you that, by pure logic alone, LLMs will be superior at simple and routine tasks sooner, which means they will compete with outsourced labor first.

LLMs need to be measured against their competition and their competition right now is outsourced labor. If an LLM can outperform an offshore team at a fraction of the cost, why would any company choose the offshore team? Especially when the LLM eliminates some of the biggest problems with offshore teams (communication barriers, round trip times).

If LLMs take any programmer jobs, they will at the very beginning make those outsourced jobs obsolete, so the only relevant question is whether they have done that or are in the process of doing so. If they don't, their impact will be minimal; if they do, their impact will be massive. I think this line of thinking is a far better benchmark than asking whether an LLM gets X or Y question wrong Z% of the time.


> If an LLM can outperform an offshore team at a fraction of the cost,..

And "a few moments later" happens the same as with those "cost effective" clouds.

[1] https://www.heise.de/en/news/IDC-Many-companies-want-partly-...

[2] https://www.idc.com/resource-center/blog/storm-clouds-ahead-... (original)


In the end, it all comes down to ROI; if spending x dollars a month brings in an additional 5x in revenue, then it's worth it, right?

Then again, I have some suspicion that a lot of consumer-focused end products using LLMs in the backend (hello, chatbots) while expecting big returns for all those tokens spent may have some bad news coming... If the bubble starts popping, I'm guessing it starts there...


Outsourced devs wielding smart models are even cheaper than onshore ones, and the models lift all boats wrt capability.

The bottleneck will soon be ideas for the things to build.


> The bottleneck will soon be ideas for the things to build

No, it won't. The utility of LLMs is already approaching its asymptote...


>Outsourced devs wielding smart models are even cheaper than onshore

But they do not compete. They have totally different jobs.


The idea that they're good for development is propped up a lot by people who can have a React + Tailwind site spun up fast. You know what also used to scaffold projects quickly? The old init scripts and generators!
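For what it's worth, the deterministic scaffolding those old generators did is trivial to sketch. Here's a toy version in Python (the template contents and file names are hypothetical), no model required:

```python
import pathlib
import tempfile

# Hypothetical miniature "init script": a fixed project skeleton,
# the kind of deterministic scaffolding generators have done for decades.
TEMPLATE = {
    "README.md": "# {name}\n",
    "src/main.py": 'print("hello from {name}")\n',
}

def scaffold(root, name):
    """Render the template files under root, substituting the project name."""
    root = pathlib.Path(root)
    for rel, body in TEMPLATE.items():
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(body.format(name=name))
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())

with tempfile.TemporaryDirectory() as tmp:
    print(scaffold(tmp, "demo"))  # → ['README.md', 'src/main.py']
```

Same output every run, which is exactly the point: for boilerplate, predictability was never the hard part.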


