Hacker News | ivoras's comments

I did a similar thing but with backend-heavy code, and I agree with this assessment:

> In particular, I asked ChatGPT to write a function by knowing precisely how I would have implemented it. This is crucial since without knowing the expected result and what every line does, I might end up with a wrong implementation.

In my eyes, it makes the whole idea of AI coding moot. If I need to explain every step in detail - and it does not "understand" what it's doing; I can virtually see the statistical trial-and-error behind its actions - then what's the point? I might as well write it all myself and be a bit more sure the code ends up how I like it.

link: https://www.linkedin.com/feed/update/urn:li:activity:7289241...


Because, as OP pointed out, it's faster. It has ready access to the correct usage of different libraries.


So do the language features of basically any modern editor.


Not really - there's a difference between having the docstring of a function available for you to read, and a model which has learned from thousands of examples how to use a particular API and integrate it into a larger set of instructions. The latter is vastly faster and takes much less human work than the former.


Except when it consistently gets said particular API wrong. I was using it to do basic graphql-yoga setup with R1 and then Claude Sonnet 3.5 and they both output incorrect usage, and got stuck in a loop trying to fix it.

If it can’t do something that basic and that common using a language and toolset with that much training data, then I’m pessimistic personally.

I’ve yet to see Copilot be useful for any of my juniors when we pair; it gets in the way far more than it helps, and it is ruining their deeper understanding, it seems.

I’ll continue trying to use these tools, but I swear you’re overselling their abilities even still.


The way to fix that is to find an example of correct usage of that API and paste that example in at the start of the prompt.

This technique can reliably make any good LLM fluent in an API that it's never seen in its training data.
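A minimal sketch of the technique (the helper name and the example snippet are illustrative, not any particular library's API): prepend a known-good usage example so the model imitates it instead of guessing from possibly stale training data.

```python
# Hypothetical prompt-builder: the api_example would be real, working
# code for the API you want the model to use correctly.
def build_prompt(api_example: str, task: str) -> list[dict]:
    """Prepend a known-good usage example so the model imitates it
    rather than reconstructing the API from training data."""
    return [
        {"role": "system",
         "content": "Follow the usage shown in the example exactly."},
        {"role": "user",
         "content": f"Example of correct API usage:\n\n{api_example}\n\n"
                    f"Task: {task}"},
    ]

messages = build_prompt(
    api_example="const yoga = createYoga({ schema })",
    task="Set up a graphql-yoga server with a custom schema.",
)
```

The same message list can then be passed to whatever chat-completion endpoint you use; the point is only that the example comes first, before the task.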


At this point with Cursor you can have it index the online docs by giving it a base URL and have it automatically RAG the relevant content into the chat (using the @ symbol to reference the docs). Both Windsurf and Cursor also support reading from URLs (iirc Aider does too).

I’ve had better luck with manually including the page but including the indexed docs is usually enough to fix API mistakes.


Begs the question again: if you need to go out of your way to find an example of correct usage of the api to paste into the prompt, why are you even bothering?

I find Copilot useful when I already know what I want and start typing it out; at a certain point the scope of the problem is narrowed sufficiently for the LLM to fill in the rest. Of course this is more in line with “glorified autocomplete” than the “replacing junior devs” I keep hearing claims of.


"if you need to go out of your way to find an example of correct usage of the api to paste into the prompt, why are you even bothering?"

Because it's faster.

Here's an example: https://tools.simonwillison.net/ocr

That's an entirely client-side web page you can use to open a PDF which then converts every page to an image (using PDF.js), then runs each image through the Tesseract.js OCR program and lets you copy out the resulting text.

I built the first version of that in about 5 minutes while paying attention to a talk at a conference, by pasting in examples of PDF.js and Tesseract.js usage. Here's that transcript: https://gist.github.com/simonw/6a9f077bf8db616e44893a24ae1d3...

I wrote more about that process here, including the prompts I used: https://simonwillison.net/2024/Mar/30/ocr-pdfs-images/

That's why I'm bothering: I can produce useful software in just a few minutes, while only paying partial attention to what the LLM is doing for me.


That's a nice little self contained example. I have yet to see this approach work for the day job: a larger codebase with complex inter-dependencies, where the solution isn't so easily worded (make the text box pink) and where the resulting code is reviewed and tested by one's peers.

We actually had to make a rule at work that if you use an LLM to create a PR and can't explain the changes without using more LLMs, you can't submit the PR. I've seen it almost work - code that looks right but does a bunch of unnecessary stuff - and then it required a real person (me) to clean it up, and it ended up taking just as much time as if it had been written correctly the first time.


That's one of my personal rules for LLM usage too: "Don't commit code you couldn't explain to someone else" - https://simonwillison.net/2024/Jul/14/pycon/#pycon-2024.062....


It's faster if all you're concerned with can fit in a static html file but what about for more complex projects?

I've struggled with getting any productivity benefits beyond single-file contexts. I've started playing with aider in an attempt to handle more complex workflows and multi-file editing but keep running into snags and end up spinning my wheels fighting my tools instead of making forward progress...


Because it still takes 5 mins for it to output the minimum viable change whereas it’d take me an hour


Yeah, that's the trick I've been using too, but by that point I get a better result by implementing it myself... of course, I've had two decades of practice and I don't have to communicate what I want lossily to myself, so it's an unfair comparison. Perhaps I've just not found the right use-case yet. I'm sure it exists, I've just not had much luck over the past couple of years (including just this past weekend).


That is far more likely to happen when it is relying on compressed knowledge of documentation and usage for an API it would have seen (comparatively) only a few times in training. That is where the various types of memory, tool calling and supplementary materials being fed in can make them significantly more situationally useful.

The LLMs you mention are first and foremost a “general knowledge” machine rather than a domain expert. In my opinion, Junior developers are the least likely to benefit from their use because they have neither the foundational understanding to know when the approach is wrong, nor the practical experience to correct any mistakes. An LLM can replace a junior dev because we expect the mistakes and potentially poor quality, but you don’t really want a junior developer doing code reviews for another junior developer before pushing code.


The expectation for junior devs will probably change as well - they'd do a lot more shadowing while learning the product. Experience is gained over time.


LLMs are way, way faster at typing code than I am. I often dictate to them exactly what I need and it saves me a bunch of time.


So you would say typing speed matters?


Yes. Here's a neat essay that helps capture why I believe that. https://www.scattered-thoughts.net/writing/speed-matters/


Sometimes it does. If typing slows you down enough to take you out of the zone then it probably does.


and, for tasks that take the LLM a minute, you can grab some coffee while it works and come back just to review. It's a great feeling.


There are tons of use cases. E.g. if you know an algorithm (take any pseudocode description from a moderately complex algorithm on Wikipedia for example) and you know the programming language, you still may be looking at an hour or two of typing just to get that pseudocode down into code using your own language, variable names, libraries.

But this is the kind of thing an LLM excels at. It gives you 200 lines of implementation right away, and you have a good understanding of both what it should look like and how it should work.

Slow and error-prone to type, but quick and easy to verify once done - that's the key use case for me.
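For instance, something like Wikipedia's Fisher-Yates shuffle pseudocode: tedious to transcribe by hand, but a one-line property check verifies the transcription.

```python
import random

def fisher_yates(items, rng=random):
    """Straight transcription of the textbook Fisher-Yates pseudocode:
    for i from n-1 down to 1, swap a[i] with a[j] for random j <= i."""
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)
        a[i], a[j] = a[j], a[i]
    return a

# quick verification: the output is a permutation of the input
assert sorted(fisher_yates(range(10))) == list(range(10))
```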


It’s a better, faster, personalized Stack Overflow. Just like SO you might be led down the wrong path by an answer, but if you’re a programmer and you say you don’t get value out of Stack Overflow I don’t believe you.


Speaking of keyboard shortcuts, I miss BSD's Ctrl-T and SIGINFO. It often helped to see if a process was hung.


I don't know exactly what these BSD things did, but there is a super easy way nowadays to get the stack for any process:

    eu-stack -i -p $(pidof ...)
Thanks to debuginfod this will even give you good backtraces right away (at the cost of some initial delay to load the data from the web; subsequent runs are fast). If you get a "permission denied" error, you probably need to set kernel.yama.ptrace_scope=0
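For a process you control, you can even approximate BSD's Ctrl-T from inside a Python program - a sketch using SIGUSR1, since Linux has no SIGINFO:

```python
import faulthandler, os, signal, tempfile

# Register a handler that dumps every thread's stack on SIGUSR1, so an
# operator's `kill -USR1 <pid>` shows whether (and where) the process
# is hung. Dumping to a temp file here; stderr is the usual choice.
out = tempfile.NamedTemporaryFile(mode="w", delete=False)
faulthandler.register(signal.SIGUSR1, file=out)

# simulate what `kill -USR1 <pid>` would do from another terminal
os.kill(os.getpid(), signal.SIGUSR1)
```

faulthandler writes from the signal handler directly to the file descriptor, so it works even if the interpreter's main loop is stuck.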


the bsd things still work; you can install a bsd in qemu or a spare laptop and try them

from your reference to kernel.yama.ptrace_scope (and your apparent belief that bsd belongs to the distant past) i infer that eu-stack is a linux thing? this looks pretty awesome, thanks for the tip!

https://stackoverflow.com/questions/12394935/getting-stacktr...


Oh, that's easy: "founder mode" means the founder is hyperfocused (or in the oldspeak, obsessed) on his work (with his company) and optimizes (micromanages) everything. We know that already. Perfection demands micromanagement. Steve Jobs was actually doing that - reaching down the ranks and directly helping (molesting) people doing some piece of a larger product he particularly cared about.

It also often causes (or is caused by) eccentric behavior (or mental issues) - but it's been done since forever, and when it's successful, we call it "visionary." When it's not, we call it "toxic."


Musk is a good example of this. Amazing visionary; also extremely toxic.


Does it support multiple brokers within the same project? As in, can I use both Kafka and RabbitMQ with it?


It is an interesting feature request. I've created an issue so you can follow the progress:

https://github.com/airtai/faststream/issues/758


Not natively, but you have the tools to run multiple brokers manually (run them together in lifespan hooks). In the near future we will add this feature to the framework core.


It's curious to see how times have changed. Soft-updates are indeed a very clever solution to the problem of file system consistency in the face of possible failures like OS crashes or power outages.

While journaling "simply" writes a journal of FS ops to a contiguous area of the drive (especially important for mechanical drives), which is fsynced faster than random writes across the platters, soft-updates opts to be really clever about the way FS ops are ordered, so that what's actually on the drive is always consistent, even with a decent amount of write caching. It doesn't guard file content, though, just the file system itself.

Soft-updates is what enabled the BSDs to support short-lived files never touching the physical drive. You could create a file, write to it, read it, close and delete it, and if this was done in a reasonably short amount of time, no writes whatsoever reached the actual hardware. It was wonderful for software that generates a lot of temp files, like building C software.

OTOH, if a write got through to the hard drive, soft-updates guaranteed that file system structures got written in such a way that if an OS crash or a power failure happened at any single time, the only downside could be some unreferenced blocks, which could be garbage collected later; assuming the hardware doesn't lie about fsync, of course...

I think ext4 supports this kind of short-lived-files-never-touch-the-drive caching.
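The short-lived-file pattern in question looks like this (a sketch; whether anything hits the platter depends on the file system and its caching, not on this code):

```python
import os, tempfile

# Create, write, read, and delete quickly, with no fsync in between.
# Under soft-updates (or ext4's delayed allocation) nothing here need
# ever reach the physical drive.
path = os.path.join(tempfile.mkdtemp(), "scratch")
with open(path, "w") as f:
    f.write("intermediate build artifact")   # lands in the page cache
with open(path) as f:
    data = f.read()                          # served from the page cache
os.unlink(path)                              # dirty pages can be discarded
```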


Hammer2

There is a port of Hammer2 being worked on for OpenBSD/NetBSD/FreeBSD/Linux.

I can only hope the developer delivers this to OpenBSD and someone then maintains it, since the developer, based on the readme, doesn’t appear to want to maintain the OpenBSD or NetBSD ports long-term.

https://github.com/kusumi/openbsd_hammer2


> the only downside could be some unreferenced blocks, which could be garbage collected later

Unless I've misunderstood,

> It doesn't guard file content, though, just the file system itself.

Is a pretty big downside, although I grant that data integrity in the face of a sudden crash/poweroff is hard without going full-CoW


Doesn't plain old Duplicity (https://duplicity.us/) do that already? (except for de-duplication)


Yeah, but...

- Row-level anything introduces write alignment and fsync alignment problems; pages are easier to align than arbitrary-sized rows

- PostgreSQL is very (maybe extremely) conservative about data safety (mostly achieved via fsync-ing at the right times), and that propagates through the IO stack, including SSD firmware, to cause slowdowns

- MVCC is very nice for concurrent access - the Oriole doc doesn't say at what concurrency the graphs were achieved

- The title of the Oriole doc and its intro text center on solving VACUUM, which is of course a good goal, but I don't think they show that the "square wave" graphs they get for PostgreSQL are in the majority caused by VACUUM. Other benchmarks, like Percona's (https://www.percona.com/blog/evaluating-checkpointing-in-pos...), don't yield this very distinctive square wave pattern.

I'm sure the authors are aware of these issues, so maybe they will write an overview of how they approached them.
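A quick way to feel the fsync conservatism mentioned above on your own hardware - a rough micro-benchmark, not a rigorous one:

```python
import os, time, tempfile

# The same writes, with and without a per-write fsync; the gap is the
# price of forcing each write through the IO stack to stable storage.
def timed_writes(fsync_each: bool, n: int = 50) -> float:
    fd = os.open(os.path.join(tempfile.mkdtemp(), "wal"),
                 os.O_WRONLY | os.O_CREAT)
    start = time.perf_counter()
    for _ in range(n):
        os.write(fd, b"x" * 4096)   # one "page" worth of WAL
        if fsync_each:
            os.fsync(fd)            # force it out to stable storage
    os.close(fd)
    return time.perf_counter() - start

buffered = timed_writes(fsync_each=False)
durable = timed_writes(fsync_each=True)
```

On a real disk the durable variant is typically orders of magnitude slower; on tmpfs or lying hardware the gap shrinks, which is exactly the "hardware doesn't lie about fsync" caveat.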


> - Row-level anything introduces write alignment and fsync alignment problems; pages are easier to align than arbitrary-sized rows

OrioleDB uses row-level WAL, but still uses pages. The row-level WAL becomes possible thanks to copy-on-write checkpoints, providing structurally consistent images of B-tree. Check the architecture docs for details. https://github.com/orioledb/orioledb/blob/main/doc/arch.md
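The general shape of row-level WAL replay over a consistent checkpoint image can be sketched like this (a toy model for illustration, not OrioleDB's actual on-disk format):

```python
# Recovery state = last structurally consistent checkpoint image,
# plus logical row-level WAL records replayed on top of it.
checkpoint = {1: "alice", 2: "bob"}   # rowid -> row at checkpoint time
wal = [
    ("update", 2, "bobby"),           # logical changes since checkpoint
    ("insert", 3, "carol"),
    ("delete", 1, None),
]

def recover(image, log):
    table = dict(image)
    for op, rowid, value in log:
        if op == "delete":
            table.pop(rowid, None)
        else:                         # insert/update carry the new row value
            table[rowid] = value
    return table

state = recover(checkpoint, wal)
```

The records are much smaller than full page images, which is where the "decreases overall IO" claim comes from.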

> - PostgreSQL is very conservative (maybe extremely) conservative about data safety (mostly achieved via fsync-ing at the right times), and that propagates through the IO stack, including SSD firmware, to cause slowdowns

This is why our first goal is to become a pure extension. Becoming part of PostgreSQL would require the test of time.

> - MVCC is very nice for concurrent access - the Oriole doc doesn't say with what concurrency are the graphs achieved

Good catch. I've added information about VM type and concurrency to the blog post.

> - The title of the Oriole doc and its intro text center about solving VACUUM, which is of course a good goal, but I don't think they show that the "square wave" graphs they achieve for PostgreSQL are really in majority caused by VACUUM. Other benchmarks, like Percona's (https://www.percona.com/blog/evaluating-checkpointing-in-pos...) don't yield this very distinctive square wave pattern.

Yes, it's true. The square pattern is because of checkpointing. The reason for the improvement here is actually not VACUUM, but modifying only the relevant indexes (and row-level WAL, which decreases overall IO).


Does the generational reference approach do something similar to MVCC in databases (e.g. PostgreSQL)?
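For context, a generational reference can be sketched like this (a toy arena, not any particular implementation). Unlike MVCC, which keeps multiple row versions visible to concurrent readers, it invalidates stale handles outright:

```python
# Each slot carries a generation counter; a handle is only valid while
# its recorded generation still matches the slot's, so use-after-free
# fails safely instead of reading a recycled slot.
class Arena:
    def __init__(self):
        self.slots, self.gens = [], []

    def alloc(self, value):
        self.slots.append(value)
        self.gens.append(0)
        return (len(self.slots) - 1, 0)   # handle = (index, generation)

    def free(self, handle):
        idx, _ = handle
        self.gens[idx] += 1               # invalidate outstanding handles

    def get(self, handle):
        idx, gen = handle
        if self.gens[idx] != gen:
            raise ValueError("stale reference")
        return self.slots[idx]

arena = Arena()
h = arena.alloc("obj")
assert arena.get(h) == "obj"
arena.free(h)
```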


Heh, doesn't fully support the furlong-firking-forthinght system :(

https://en.wikipedia.org/wiki/FFF_system


*fortnight*


No love for elbows either.


Or is this another instance of "UFOs appear when we are close to a nuclear crisis"?

