You speak so authoritatively about quality and performance of these models, yet there are no quantitative metrics that correlate to real world outcomes that indicate that the quality and performance of these models is anything but subjective noise and classic benchmark nonsense.
A company consumed half a billion dollars worth of tokens in a month and nobody noticed anything until the bill came due.
That $500m is roughly equivalent to 2000 people working for a year or 500 people working for four years. They can and would accomplish a lot if they worked in companies that add value to the economy by solving real problems.
Indeed, it's irrelevant. Each firm will make its own cost-benefit analysis, especially since the frontier labs are raising prices.
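To make that equivalence concrete, here is a quick back-of-the-envelope check. The only inputs are the numbers in the comment; the per-person figure is what they imply, not something stated anywhere, and `cost_per_person_year` is just an illustrative helper name.

```python
# Figures taken from the comment; the per-person cost is implied, not stated.
total_spend_usd = 500_000_000

# The two staffing scenarios the comment treats as equivalent: (people, years).
scenarios = {
    "2000 people for 1 year": (2000, 1),
    "500 people for 4 years": (500, 4),
}


def cost_per_person_year(total_usd, people, years):
    """Annual cost per person implied by spreading the spend over people * years."""
    return total_usd / (people * years)


for label, (people, years) in scenarios.items():
    cost = cost_per_person_year(total_spend_usd, people, years)
    print(f"{label}: ${cost:,.0f} per person-year")
```

Both scenarios are 2000 person-years, so they imply the same cost per person-year, which is presumably meant as a fully loaded cost rather than salary alone.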
Marketing only takes you so far in creating noise.
It's weird seeing this focus on benchmarks again - PCs did this for quite some time. But in the end it came down to: what does all this additional horsepower let you do? Oh, create interesting apps, multi-tasking, etc. Which was really the value-add.
> You speak so authoritatively about quality and performance of these models, yet there are no quantitative metrics that correlate to real world outcomes that indicate that the quality and performance of these models is anything but subjective noise and classic benchmark nonsense.
I'm responsible for the AI rollout at a small business, and we've had data science go over the results we get internally for 12+ months now. It's just my experience, but that is roughly the result we've seen using Deepseek, etc. and comparing cost/results vs. Anthropic/ChatGPT.
> A company consumed half a billion dollars worth of tokens in a month and nobody noticed anything until the bill came due.
It came from a single anonymous source. It's highly unlikely to be true in my view, but hey, you do you.
Your response shows how much trouble the West is in. You are completely missing the fact that this is the leading edge of a petroleum supply shortage that is going to have significant and painful effects on all those who are not prepared.
At this point the United States is the least prepared country in the world and Americans are going to be the hardest hit, simply because we have the most to lose.
Papering over the canary in the coal mine never saved the coal miners.
Why the hell do you people know your IQ? That test is a joke, there’s zero rigor to it. The reason it’s meaningless is exactly that, it’s meaningless and you wasted your time.
Why one would continue to know or talk about the number is a pretty strong indicator of the previous statement.
You're using words like "zero" and "meaningless" in a haphazard way that's obviously wrong if taken literally: there's a non-zero amount of rigour in IQ research, and we know that it correlates (very loosely) with everything from income to marriage rate so it's clearly not meaningless either.
The specifics of an IQ score aren't super meaningful by themselves (that is, a 150 vs a 142 or 157 is not necessarily meaningful), but evaluations that correlate with IQ correlate with better performance.
Because of perceived illegal biases, these evaluations are no longer used in most cases, so we tend to use undergraduate education as a proxy. Places that are exempt from these considerations continue to make successful use of it.
This isn't the actual issue with them. The actual issue is "correlation is not causation". IQ scores are normally distributed by definition, but there's no reason to believe the underlying structure is normal.
If some people in the test population got 0s because the test was in English and they didn't speak English, and then everyone else got random results, it'd still correlate with job performance if the job required you to speak English. Wouldn't mean much though.
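That thought experiment is easy to simulate. This is only a sketch under made-up assumptions: the 30% share of non-English speakers, the score range, and the noise level are all invented for illustration, and `simulate` and `pearson` are hypothetical helper names.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=10_000, non_speaker_share=0.3, seed=0):
    """Return (overall_r, speakers_only_r) for the thought experiment."""
    rng = random.Random(seed)
    scores, performance, speaks_english = [], [], []
    for _ in range(n):
        speaker = rng.random() >= non_speaker_share
        # Non-speakers score 0; everyone else gets a score unrelated to anything.
        score = rng.uniform(80, 120) if speaker else 0.0
        # Job performance depends only on speaking English, plus noise.
        perf = (1.0 if speaker else 0.0) + rng.gauss(0, 0.5)
        scores.append(score)
        performance.append(perf)
        speaks_english.append(speaker)
    overall_r = pearson(scores, performance)
    sp_scores = [s for s, e in zip(scores, speaks_english) if e]
    sp_perf = [p for p, e in zip(performance, speaks_english) if e]
    return overall_r, pearson(sp_scores, sp_perf)


overall_r, speakers_only_r = simulate()
print(f"all test takers:       r = {overall_r:.2f}")
print(f"English speakers only: r = {speakers_only_r:.2f}")
```

Across everyone, the score correlates clearly with performance, but among English speakers alone the correlation is close to zero, because the score is only acting as a proxy for "speaks English".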
> we tend to use undergraduate education as a proxy
Neither an IQ test nor your grades as an undergraduate correlate to performance in some other setting at some other time. Life is a crapshoot. Plenty of people in Mensa are struggling and so are those that were at the top of class.
Do you have data to back that up? Are you really trying to claim that there is no difference in outcomes between the average or below-average graduate and the summa cum laude one?
That is moving the goalposts. No one claimed it is the sole predictor. The claim was that there is no relation at all. Your own links say there is a predictive relationship. Of course other factors matter, and may even be more important, but with all else equal, grades are positively correlated.
It’s about trend. Not <Test Result>==Success. These evaluations try to put an objective number to what most of us can evaluate instinctively. They are not perfect or necessarily fair. Many, maybe most, job interviews are really a vibe assessment, so it’s an imperfect thing!
I don’t know my IQ, but I probably would score above average and have undiagnosed ADHD. I scored in the 95th percentile + on most standardized tests in school but tended to have meh grades. I’m great at what I do, but I would be an awful pilot or surgeon.
Growing up, you know a bunch of people. Some are dumb, some are brilliant, some disciplined, some impetuous.
Think back, and more of the smart ones tend to align with professions that require more brainpower. But you probably also know people who weren’t brilliant at math or academics, but they had focus and did really well.
For me it was just a coincidence of MENSA advertising their events in my high school and being pushed by a couple of friends to go through testing and join together.
I guess if you're an outlier you sometimes know. For example, the really brilliant kids are oftentimes found out early in childhood and tested. Is it always good for them? Probably not, but that's a different discussion.
You are right that outside of the massive capex spending on training models, we don't see that much of an economic impact, yet. However, it's very far from zero:
Remember those outsourcing firms that essentially only offer warm bodies that speak English? They are certainly already feeling the impact. (And we see that in labour market statistics for, e.g., the Philippines, where this is/was a big business.)
And this is just one example. You could ask your favourite LLM about a rundown of the major impacts we can already see.
But those warm bodies that speak English, they offer a service by being warm, and able to sort of be attuned to the distress you feel. A frigging robot solving your unsolvable problem ? You can try, but witness the backlash.
We are mixing up two meanings of the word 'warm' here.
There's no emotional warmth involved in manning a call centre and explicitly being confined to a script and having no power to make your own decisions to help the customer.
'Warm body' is just a term that has nothing to do with emotional warmth. I might just as well have called them 'body shops', even though it's of no consequence that the people involved have actual bodies.
> A frigging robot solving your unsolvable problem ? You can try, but witness the backlash.
Front line call centre workers aren't solving your unsolvable problems, either. Just the opposite.
And why are you talking in the hypothetical? The impact on call centres etc is already visible in the statistics.
This is such a BS response. First, just because a job isn’t physically exhausting doesn’t mean it’s not challenging and mentally exhausting.
Second, our job in technology is to make ALL jobs easier. That’s what technology is for, not for bullshit manipulative, addictive, and extractive consumer crap. The reason any of it even exists is to improve the productivity of humans.
There will always be demanding jobs. They may be demanding physically, or mentally, or both. Your god damn job is to figure out how to make every one of those jobs easier and LESS physically and mentally challenging.
Pointing out the obvious fact that, by different metrics, other jobs are harder is neither helpful, valuable, nor unique.
I will however agree with your last statement: technology’s abuse of people in the consumer app space is anti-social and destructive to the world, and those are “jobs” we created with technology. In a sense you might say we are responsible for creating the worst jobs in the world, because as easy and valueless as being an influencer is, it destroys people mentally and turns them into shells of human beings.
So instead of trying to imply that all your fellow engineers are a bunch of whiny, soft, and weak complainers, you should be both grateful that there are jobs that are physically easy and obligated to help those whose jobs still aren’t easy make as much of their jobs as easy as possible.
We live in a society where we are ALL dependent on each other. Specialization is what allows us to have large, complex societies; without it we would all be trying to find food and build shelter. We can ONLY have our jobs because others do theirs. Never forget that: that fact creates an OBLIGATION, not a comparison.
Software is the connective tissue of the world. Generating mediocre-quality results (which will be the best outcome if you don’t really understand what you are looking at) is not just lazy, it can be dangerous. Do the world’s best engineers make mistakes? Of course they do, but that’s why building high-quality software is a collaborative process: you have to work with others to build better systems. If you aren’t, you are wasting your time.
As of now (and this could change, but that doesn’t change the moral and ethical obligations), software engineers are richly rewarded specifically because they should be able to write and understand high-quality code. The code written is the foundation of how our entire modern world is built.
"This stuff" is by far the most-discussed stuff on HN in the last couple months. Nothing else comes close.
Below, I've pasted a partial list. That's restricted to just muskdogeness and only the biggest threads. There are 25,000 comments in those threads alone.
How this translates into "the threads are silenced" and "people give up on commenting" is left as an exercise to the reader.
Yup, they're permaflagged and HN knowingly lets them go by even though they could do differently. Case in point: https://news.ycombinator.com/item?id=43208973. Seems exceedingly likely that this one too would've gotten auto-perma-flagged, but then a button was pushed to undo this as a one-off that is not applied to all these other cases. Of course @dang is free to prove us wrong and explain the discrepancy.
That's an underestimate though, since the list is far from comprehensive. Let's conservatively bump the number up to 40k comments. That's over 1000 comments a day and well over 10% of the total comments that have been posted to HN during this period.
From my perspective that's not the same as "letting it go by".
Hmm, that's interesting, dang. I'm almost never seeing any 'muskdogeness' posts (nice name!) on the first page. I'm looking at new posts, with showdead = True.
Would you have a stat on all those posts, if they were flagged at some point, and then un-flagged later?
I looked at the frontpage time of those 50 threads and it adds up to almost exactly 300 hours on the front page. That's a lot of hours.
But it was over a total time span of 900 hours, so still pretty easy to miss all 50 threads. This is the way the HN frontpage works: no one sees everything that makes the front page (not even us), and it's entirely possible to miss the largest threads and most-discussed topics.
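For anyone who wants to sanity-check those numbers, here is a rough sketch. It assumes the 900-hour span covers the same period as the 40k-comment estimate given earlier in the thread, which is my assumption rather than something stated; the variable names are illustrative only.

```python
# Numbers quoted in the thread.
frontpage_hours = 300   # combined front-page time of the listed threads
span_hours = 900        # total time span those threads were spread over
threads = 50
comments = 40_000       # the "conservative" bumped-up estimate

# Derived figures.
share_of_span = frontpage_hours / span_hours       # summed front-page hours vs. the span
avg_hours_per_thread = frontpage_hours / threads   # average front-page time per thread
span_days = span_hours / 24
comments_per_day = comments / span_days            # assumes the span matches the comment period

print(f"front-page hours as a share of the span: {share_of_span:.0%}")
print(f"average front-page time per thread: {avg_hours_per_thread:.1f} hours")
print(f"span length: {span_days:.1f} days")
print(f"comments per day: {comments_per_day:,.0f}")
```

Under that assumption the "over 1000 comments a day" figure checks out, and an average of about six front-page hours per thread makes it plausible that a reader could miss most of them.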
For example, https://news.ycombinator.com/item?id=43208973 is by now the second-largest thread in HN's history and spent 16 hours on the front page, but there are still going to be thousands of regular HN readers who never saw it, and some of those will probably feel angry about that and say that it has been "silenced" and "censored" and so on. That's the way this works.
Ultimately, it works that way because of fundamentals, meaning there's not much we can do about it. The solution is not to have 100 threads with 80k comments on the frontpage for 600 hours, instead of 50 threads with 40k comments for 300 hours, even though that's probably what most people who feel frustrated think they want. Rather, the solution is to articulate the principles by which HN operates, and keep sticking to those principles over time.
I've been posting a ton about that in recent weeks, although (by the same dynamic I just described!) many readers won't yet have seen any of those posts. Here are two entry points:
He's acting like this because he campaigned on "government bad". Not any specific bit of it: all of it.
It is often theorized that he's doing this on behalf of some foreign power. That seems unlikely, and more to the point, uninformative. He was democratically elected: a plurality of voters wanted this outcome. He still has wide support among them.
It is true that foreign powers have been spreading propaganda in his favor, and the result is a weakening of the country. But that's not the result of one person betraying the nation. It's the result of American beliefs about what their nation is and should be.
The term for them is useful idiots, not assets. The Democrats did enormous damage to the Ukrainian cause by pushing the Russia hoax in 2017/18. Steele assumed there were competent intelligence analysts to analyse what he collected, but instead it was politicians.