kind of disagree here. on the surface this makes sense, but this isn't "Adobe Pro vs Freemium version" where some tiny vertical slice of your business can be made slightly more efficient with a b2b enterprise plan. this is generalized intelligence and literally everybody can benefit from it in an immeasurable number of ways. i would go as far as to actually compare it more to water or air than a tool.
if only the hyper wealthy can access the pure water that doesn't give you cancer while the rest of us drink from the Ganges river/sub-100iq models that drool and hallucinate/waste time, then I would say that's pretty terrible for the world. it'll just create extreme disparity in our world, far far worse than anything that exists today.
and you may think, man what a ridiculous example, but think about it this way: what happens when something like Mythos or some future model can actually solve your specific cancer (we're getting closer and closer), but is entirely impossible to afford? Or perhaps you need boosters that require the AI to create more of, and now you're reliant on a model that is too expensive.
I’m entirely in agreement with this POV, but I’m also copacetic about it:
You could have said much the same about computers in the world dominated by IBM mainframes 60 years ago. Now we have vastly more powerful computers on our wrists (or our pacemakers!), let alone in our pockets or on our desks.
As far as my understanding goes the bottleneck for what you are talking about is hardware not software, so open source won't help that much for the foreseeable future.
> and you may think, man what a ridiculous example, but think about it this way: what happens when something like Mythos or some future model can actually solve your specific cancer (we're getting closer and closer), but is entirely impossible to afford? Or perhaps you need boosters that require the AI to create more of, and now you're reliant on a model that is too expensive.
Isn't that already the case with current care? Wealthy people get a standard of care poor people couldn't even dream of. Rich people live, temporarily embarrassed millionaires die.
it's really sad. the internet is slowly dying (along with humans using their brains, it seems). On twitter, any time anything interesting is posted (which also is usually "enhanced" with AI) 80% of the responses are all very clearly just claude/gpt replies. what is the fucking point? whats the end game for these accounts?? i HATE that i have to sit there parsing through all of this cruft. Fuck this whole article could've been so much shorter, and more concise if a human just sat there, timeboxed themselves, and wrote!
i think that's a chicken and egg cultural problem. build cities in a way where bicycles/walking is encouraged, then over time you'll have people that want to do exactly that.
> And 2 years is probably pretty average for the whole tech industry.
maybe for a fungible CRUD engineer. I think Karpathy is in a different league and I'm certainly surprised to hear this fact. I would expect someone like him to sit within a certain lab for a long time
He's an extraordinarily bright guy. He can get a lot more done in two years than most people, and he can get up to speed with a new organization and a new task and be productive much faster than most people.
My impression with no inside knowledge, but understanding what Elon companies are like, is that he was assigned essentially an impossible task at Tesla and tried his very best, but it could not be done, and he semi-burned out. It makes sense for him to be getting back on the horse now.
The Elon approach to management as I see it is to assign what normally would be totally unreasonable goals to a small group of extremely bright people, and they work their asses off and somehow find a way. Sometimes this works, and sometimes it doesn't. If it works and the impossible was in fact, just barely possible, you dominate the market, everyone gets rich, and the people see it as the most exciting, intense, and rewarding part of their career. If it doesn't, they get depressed, divorced, and looking for other work. The Elon magic is threading the needle closely enough that a lot of the seemingly impossible things are in fact possible with enough hard work and brainpower, but although Elon is extremely good at this, the nature of the thing is that you can't predict which side you'll wind up on fully accurately.
That seems like the opposite. Why would someone with high market value stay in one place? 2 years is basically optimal - you vest 50%, maybe collect a promotion, do some good work and learn a lot, and then get to move on for another solid bump/ promotion and a new set of stocks.
I expect the people with low market value to be the ones sticking around labs for long periods of time, they don't have the option to move and they aren't getting poached.
same. maybe it just depends on the bank, but i can't imagine why that would matter at all. they have the whole picture of your financial history, generally. what does it matter whether that one bank account has only enough in it to pay off the loan every month.
The longer I am alive the more I realize that power is all that matters, and that rules are nice but only for the peons. "Acceptable" in this case means pretty much nothing and is a word that is philosophic in its meaning. You can yell into the clouds that something is unacceptable or unfair and it may be true in some ethical/moral sense, but it matters none. Power will always win out and if someone came to the WH and did the same thing, then there would only be one reason for it -- that there is somebody more powerful than the US and is able to get away with things like this. The masses would scream, cry and maybe some would be happy, but it wouldn't matter whatsoever. Maduro might have been bad (a great excuse for the masses to avoid revolts) but ultimately, the government made a decision to do it and that's that.
I am not a fan of "well what can ya do?" That's not how we got the 40 hour work week or civil rights legislation. That's not how women got the right to vote. You have to fight and fight and fight for a better world. I mean that.
It's literally how you got those things. Without leverage to get them, they would have just been complaints. You ask what you can do, and then you do it.
I meant more in the sarcastic/defeatist sense. A linguistic shrug not to be taken in the literal sense. That's on me though, I should've picked better wording.
that's how i've felt about all AI design. the harnesses get better and cooler, and the outputs up the baseline of utter crap to "whoa that doesn't look bad at all!" which works for probably 90% of the web, but anything truly unique still requires a lot of human taste. maybe that will change one day, but I hope it doesn't.
> Task: Scan `sys/rpc/rpcsec_gss/svc_rpcsec_gss.c` for
> concrete, evidence-backed vulnerabilities. Report only real
> issues in the target file.
> Assigned chunk 30 of 42: `svc_rpc_gss_validate`.
> Focus on lines 1158-1215.
> You may inspect any repository file to confirm or refute
behavior."
I truly don't understand how this is a reproduction if you literally point to look for bugs within certain lines within a certain file. Disingenuous. What's the value of this test? I feel like these blog posts all have the opposite of their intent, Mythos impresses me more and more with each one of these posts.
> I truly don't understand how this is a reproduction if you literally point to look for bugs within certain lines within a certain file. Disingenuous.
You missed this part:
> For transparency, the Focus on lines ... instructions in our detection prompts were not line ranges we chose manually after inspecting the code. They were outputs of a prior agent step.
We used a two-step workflow for these file-level reviews:
Planning step. We ran the same model under test with a planning prompt along the lines of "Plan how to find issues in the file, split it into chunks." The output of that step was a chunking plan for the target file.
Detection step. For each chunk proposed by the planning step, we spawned a separate detection agent. That agent received instructions like Focus on lines ... for its assigned range and then investigated that slice while still being able to inspect other repository files to confirm or refute behavior.
That means the line ranges shown in the prompt excerpts were downstream artifacts of the agent's own planning step, not hand-picked slices chosen by us. We want to be explicit about that because the chunking strategy shapes what each detection agent sees, and we do not want to present the workflow as more manually curated than it was.
What's the problem of walking the entire repo having one file at a time be the entry point for the context of an agent with tools available to run the code and poke around in the repo?
because some vulnerabilities are complex combinations of ideas and simply ingesting one file at a time isn't enough. and then the question is, well how many files, and which? and when trying to solve for that problem, then you're basically asking something intelligent on how to find a vulnerability
yeah but i think my point is that you need an intelligent model to combine the files in such a way that you could give the proper context for a cheaper/dumber model to potentially find exploits. if you have dumber models doing this, wouldn't you have a borderline infinite combination of ways to setup context before you end up finding something?
if only the hyper wealthy can access the pure water that doesn't give you cancer while the rest of us drink from the Ganges river/sub-100iq models that drool and hallucinate/waste time, then I would say that's pretty terrible for the world. it'll just create extreme disparity in our world, far far worse than anything that exists today.
and you may think, man what a ridiculous example, but think about it this way: what happens when something like Mythos or some future model can actually solve your specific cancer (we're getting closer and closer), but is entirely impossible to afford? Or perhaps you need boosters that require the AI to create more of, and now you're reliant on a model that is too expensive.
Open source needs to save us all from this
reply