
You wouldn’t pay a human to write 100k LOC. Or at least you shouldn’t. You’d pay a human to write a working useful compiler that isn’t riddled with copyright issues.

If you didn’t care about copying code, usefulness, or correctness you could probably get a human to whip you up a C compiler for a lot less than $20k.





Are you trolling me? Companies (made of humans) write 100,000 LOC all the time.

And it's really expensive, despite your suspicions.


No, companies don’t pay people to write 100k LOC. They pay people to write useful software.

We figured out that LOC was a useless productivity metric in the 80s.


[flagged]


I can't stress enough how much LOC is not a measure of anything.

Yep. I’ve seen people copy hundreds of lines instead of adding an if statement.

In fact it is, and it can be useful. If you have quality controls in place, so that the code is of reasonable quality, LOC will correlate with the amount of functionality and/or complexity. Is it a good metric? No. Can it be used as-is to compare arbitrary code bases? Absolutely not!

As a seasoned manager, I have an idea of how long a feature should take, both in implementation effort and in length of code. I have to know it; it is my everyday work.


OK, well, the people in MY software industry use LOC as an informal measure of complexity.

LIKE THE WHOLE WORLD DOES.

But hey, maybe it's just the extremely high profile projects I've worked on.


As an informal measure of the complexity of the code, sure: 100k lines are inherently more complex than 10k because there’s just more there to look at. And if you assume that two projects were made by competent teams, saying that one application is 10k LOC and the other is 1 million might be useful as a heuristic for the number of man-hours spent.

But I can write a 100k LOC compiler where 90k lines are for making error messages look pixel perfect on 10 different operating systems. Or where 90k lines are useless layers upon layers of indirection. That doesn’t mean that someone is willing to pay more for it.

AI frequently does exactly that kind of thing.

So saying my AI made a 100k LOC program that does X, and then comparing the cost to a 100k LOC program written by a human is a nonsense comparison. The only thing that matters is to compare it to how much a company would pay a human to produce a program capable of the same output.

In this case the program is commercially useless. Literally of zero monetary value, so no company would pay any money for it. Therefore there’s nothing to compare it to.

That’s not to say it’s not an interesting and useful experiment. Or that things can’t be different in the future.


Such as?

Without questioning the LOC metric itself, I'll propose a different problem: LOC for human and AI projects are not necessarily comparable for judging their complexity.

For a human, writing 100k LOC to do something that might only really need 15k would be a bit surprising and unexpected - a human would probably reconsider what they were doing well before they typed 100k LOC. Whereas an AI doesn't necessarily have that concern: it can just keep generating code and doesn't care how long it takes, so it doesn't have the same practical pressure to produce concise code.

The result is that while large enough human-written programs probably reach an average "density" of LOC relative to the complexity of the original problem, AI-generated programs probably average out at an entirely different "density" number.
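
To put rough numbers on the "density" point, here is a minimal sketch in Python. It reuses the 100k and 15k figures from above purely as an illustration. The function names and the idea of an "inflation factor" are my own assumptions, not an established metric, and in practice nobody knows the "needed" LOC of a real project with any precision.

```python
def loc_inflation(actual_loc: int, needed_loc: int) -> float:
    """Lines actually written per line the problem arguably needed (hypothetical measure)."""
    if needed_loc <= 0:
        raise ValueError("needed_loc must be positive")
    return actual_loc / needed_loc


def normalized_loc(actual_loc: int, inflation: float) -> float:
    """Scale a raw LOC count down by an assumed inflation factor."""
    if inflation <= 0:
        raise ValueError("inflation must be positive")
    return actual_loc / inflation


# Illustrative figures from the comment above: 100k LOC written, 15k really needed.
ratio = loc_inflation(100_000, 15_000)
print(round(ratio, 2))  # 6.67
```

If the two kinds of project sit at different inflation factors, comparing their raw LOC counts compares different units, which is the whole problem with pricing one against the other by line count.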


Your first post specifically stated:

"I'm curious - do you have ANY idea what it costs to have humans write 100,000 lines of code???"

which any reasonable reading would take to mean "paid-by-line", which we all know doesn't happen. Otherwise, I could type out 30,000 lines of gibberish and take my fat paycheck.


> you could probably get a human to whip you up a C compiler for a lot less than $20k

I fork Clang or GCC and rename it. I'll take only $10k.


My question, which I still haven't seen anybody ask: how many compilers, including but not limited to the two most famous, were in the training set?

Certainly tcc. Probably also rui314's chibicc as it's relatively popular. sdcc is likely in there as well. Among numerous others that are either proprietary or not as well known.


