I find myself coding a lot with Claude Code, but then it's very hard to quantify the productivity boost.
The first 80% seems magical; the last 20% is painful. I have to basically get the mental model of the codebase in my head no matter what.
I have the issue that I run into some bug that it just cannot fix. Bear in mind I am developing an online game. And then I have to get into the weeds myself, which feels like such a gargantuan effort after having used the LLM that I just want to close the IDE and go do something else. Yes, I have used Opus 4.6 and Codex 5.3, and they just cannot solve some issues no matter how I twist it. It might be the language, and the fact that it is a game with a custom engine and not a React app.
I talked with my coworker today and asked which model he uses. He said Opus 4.6, but also that he doesn't use any AI stuff much anymore, since he felt it keeps him from learning and building the mental model, which I tend to agree with a bit.
I get this at least once a week. And once you have to dig in and understand the full mental model, it's not really giving you any uplift anyway.
I will say that doing this for enough months has made me much quicker at picking up the mental model and at scoping how much I need to absorb. It seems possible that with another year you'd become very rapid at this.
We're only not letting go because it's not quite there yet. Once AI is there, someone will let go, and to keep up with everyone else, you'll let go too.
Wait a bit longer and the next thing that's let go after you "let go" is you.
Provide better context to LLMs: more documentation, more skills, better Claude files, and more ways to harness them (tests, compilers, access to artifacts, etc.).
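For concreteness, here's what one of those "Claude files" (a CLAUDE.md project-context file) might look like for a codebase like the custom-engine game mentioned above. The contents are entirely hypothetical, just a sketch of the kind of context people mean:

```markdown
# CLAUDE.md — project context for the agent (contents hypothetical)

## Architecture
- Custom C++ engine; no external game framework.
- Networking: authoritative server with client-side prediction, under `net/`.

## Conventions
- Run `make test` before proposing any change.
- Never edit generated files under `gen/`; change the generators instead.

## Known pitfalls
- Physics and render ticks are decoupled; see `docs/timing.md` before
  touching anything time-dependent.
```

The point is less the file format and more that the agent gets architectural intent it can't infer from the code alone.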
Why? Isn’t documentation just an approximation of the code, and therefore less informative for inference than the code itself?
I understand that the code doesn’t contain the architectural intent, but if the LLM writing it can’t provide that then it will never replace the architect.
Of course an LLM can make a thorough design analysis and extract architectural patterns.
But it doesn't have infinite memory and context.
On top of that, it may recognize patterns, but not their intent and scope.
Documentation is gold for humans and LLMs alike. But LLMs arrived as the very first major moment in this field with little to no engineering practice focused on documentation and specs.
It's about the mental model of the codebase, as mentioned by the GP.
Somehow my experience is that no matter how much documentation or context there is, eventually the model will do the wrong thing, because it won't be able to figure out something that only makes sense in the context of the design direction, even if it's painstakingly documented. So eventually the hardest work, that of understanding everything down to the smallest detail, has to be done anyway.
And if all it was missing was more documentation... then the agent should have been able to generate that as the first step. But somehow it can't do it in a way that helps it succeed at the task.
> I have to basically get the mental model of the codebase in my head no matter what.
Ah yes, I feel this too! And that's much harder with someone else's code than with my own.
I unleashed Google's Jules on my toy project recently. I try to review the changes, amend the commits to get rid of the worst of it, and generally supervise the process. But still, it feels like the project is no longer mine.
Yes, Jules implemented in 10 minutes what would've taken me a week (the trigonometry to determine the right focal point and length given my scene). And I guess it is the right trigonometry, because it works. But I fear going near it.
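For what it's worth, the standard trigonometry for this kind of camera fitting is fairly compact. This is a hypothetical sketch (the function names and the bounding-sphere framing are my assumptions, not necessarily what Jules produced):

```python
import math

def fit_distance(scene_radius, vertical_fov_deg):
    """Distance at which a bounding sphere of radius scene_radius
    exactly fills the vertical field of view: d = r / sin(fov / 2)."""
    half_fov = math.radians(vertical_fov_deg) / 2.0
    return scene_radius / math.sin(half_fov)

def focal_length_for_fov(sensor_height, vertical_fov_deg):
    """Pinhole-camera relation between focal length and FOV:
    f = (h / 2) / tan(fov / 2)."""
    half_fov = math.radians(vertical_fov_deg) / 2.0
    return (sensor_height / 2.0) / math.tan(half_fov)
```

Being able to check generated code against a known closed-form relation like this is one way to build back a bit of the trust.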
ah, but you can always just ask the LLM questions about how it works. it's much easier to understand complex code these days than before. and also much easier to not take the time to do it and just race to the next feature
Indeed. But Jules is not really questions-based (it likes to achieve stuff!) and the free version of Codeium is terrible and does not understand a thing. I think I'll have to get into agentic coding, but I've been avoiding it for the time being (I rather like my computer and don't want it to execute completely random things).
Plus, I like the model of Jules running in a completely isolated way: I don't have to care about it messing up my computer, and I can spin up as many simultaneous Juleses as I like without fear of interference.
This is my experience, which is why I stopped altogether.
I think I'm better off developing a broad knowledge of design patterns and learning the codebases I work with in intricate, painstaking detail as opposed to trying to "go fast" with LLMs.
It's the evergreen tradeoff between the short and long terms. Do I get the nugget of information I need right now but lose it in a month, or do I spend the time and energy that leads to deeper understanding and years-long retention of the knowledge?
There is something about our biology that makes us learn better when we struggle. There are many named concepts for this dynamic: the generation effect, the testing effect, the spacing effect, desirable difficulties, productive failure... it all converges on the same phenomenon: the easier it is to learn, the worse we learn.
Take K-12, for instance. As computing technology is integrated further and further into education, cognitive performance decreases in a near-linear relationship. Gen Z is famously the first generation to perform worse on every cognitive measure than previous generations, for as long as we've been keeping records, going back to the 19th century. An uncomfortable truth emerging from studies on electronics use in schools is that it isn't just the phones driving this. It's more the Duolingo effect of software overall: emulating the sensation of learning without actually changing the brain state, because the software that actually challenges you is not as engaging or enjoyable.
How you learn, and your ability to parse, infer, and derive meaning from large bodies of information, is increasingly a differentiator in both the personal and professional worlds. It's even more so the case when many of your peers are now learning through LLM-generated summaries averaging just 300 words, perhaps skimming outputs around 1,000 words in length for "important information". The immediate benefits are obvious, but the cost of outsourcing that cognitive work gets lost in the convenience.
Because remember, this isn't just about your ability to recall a specific regex, follow a syntax convention, or ship a certain amount of code in an hour. Your brain needs exercise, and deep learning is one of the most reliable ways to get it. Doubly true if you're not even writing your own class names.
What I am speaking to is not far away or hypothetical, either. Because as of 2023, one in four young adults in the United States is functionally illiterate.
Effective learning and memorization actually sit at the narrow edge of struggle: neither "too easy" nor "too hard and painful". Spaced-repetition systems (SRS) do a very good job of tuning this: by the time a question comes back to you it will feel difficult, but you'll be able to recall the information and answer it with some effort. It's a matter of recognizing this feeling and acknowledging it as "the right kind of effort" rather than a hopeless task.
If you ask the AI "please quiz me about the proper understanding of issues x y z and tell me if I got it all right. iterate for anything I get seriously wrong, then provide a summary at the end and generate SRS cards for me to train on" it will generally do a remarkably good job at that.
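The tuning knob most SRS tools (the Anki family among them) turn is some variant of the SM-2 algorithm. A deliberately simplified sketch of one review update, to show how the interval stretches when recall succeeds and resets when it lapses:

```python
def sm2_update(interval_days, ease, quality):
    """One simplified SM-2-style review update.

    quality: self-rated recall from 0 to 5; below 3 counts as a lapse.
    Returns (next_interval_days, new_ease_factor).
    """
    if quality < 3:
        # Lapse: restart the card and penalize the ease factor slightly.
        return 1, max(1.3, ease - 0.2)
    # Adjust ease per SM-2's update rule, floored at 1.3.
    ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    if interval_days == 0:
        return 1, ease       # first successful review: see it again tomorrow
    if interval_days == 1:
        return 6, ease       # second success: jump to roughly a week
    return round(interval_days * ease), ease  # then grow geometrically
```

The exact constants vary between implementations; the point is that the schedule keeps each card right at the "difficult but recallable" edge described above.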
I agree, and to address this I've tried using them to understand large codebases, but I haven't worked out how to prompt this effectively yet. Has anyone gone this route?