I spent a little bit of time poking at Gemini to see what it thought the accident rate in an urban area like Austin would be, including unreported minor cases. It estimated 2-3/100k miles. This is still lower than the extrapolation in the article, but maybe not notably lower.
We need far higher quality data than this to reach meaningful conclusions. Implying conclusions based upon this extrapolation is irresponsible.
It is at least as reliable as the data in the Electrek article. My point is that the data naturally has error margins that are clearly large enough to make drawing concrete conclusions impossible.
Somewhat amusingly, the human rate should also be filtered based upon conditions. For years people have criticized Tesla for not adjusting for conditions with their AP safety report, but this analysis makes the same class of mistake.
A rate of 1/500k miles that includes interstate driving will be very different from the rate for an urban environment.
Yeah, I'm glad that they are trying to do a rate; the problem is that the numerator in the human case is likely far larger than what they are indicating.
Of the Tesla accidents, five involved collisions with fixed objects, animals, or a non-injured cyclist. Extremely minor versions of these with human drivers often go unreported.
Unfortunately, without the details, this ends up being a comparison between two rates with very different measurement approaches.
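To make the error-margin point concrete, here is a rough back-of-the-envelope sketch (every number in it is a made-up placeholder, not a figure from the article): with only a handful of incidents in the numerator, even a crude Poisson-style interval on the rate spans several-fold, and the human baseline moves just as much depending on how much under-reporting you assume.

    // Rough sketch of why small numerators make this comparison shaky.
    // Every figure below is an illustrative placeholder, not data from the article.
    fn main() {
        // Hypothetical: 7 reported driverless incidents over 300,000 miles.
        let incidents = 7.0_f64;
        let miles = 300_000.0_f64;

        // Crude 95% interval on the count via a normal approximation to the
        // Poisson distribution (good enough for a back-of-the-envelope look).
        let half_width = 1.96 * incidents.sqrt();
        let per_100k = |count: f64| count / miles * 100_000.0;
        println!(
            "driverless: {:.2} per 100k miles (95% CI roughly {:.2} to {:.2})",
            per_100k(incidents),
            per_100k((incidents - half_width).max(0.0)),
            per_100k(incidents + half_width)
        );

        // Human baseline: a reported 1 per 500k miles, then the same rate after
        // assuming some fraction of minor incidents never gets reported.
        let reported_per_100k = 1.0 / 500_000.0 * 100_000.0;
        for unreported_fraction in [0.0_f64, 0.5, 0.8] {
            let adjusted = reported_per_100k / (1.0 - unreported_fraction);
            println!(
                "human baseline, {:.0}% unreported: {:.2} per 100k miles",
                unreported_fraction * 100.0,
                adjusted
            );
        }
    }

With numbers in that ballpark, the driverless interval and the plausible range of human baselines can overlap, which is the "error margins make concrete conclusions impossible" point in miniature.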
Yeah, this company went through an amazingly bad period. They quit innovating, and also worked really hard to segment their products in a way that would extract every last $ out of the consumer. "Oh, you want it not to run into things? You'll need one more step up for another $100-200." It wasn't really based on the hardware so much as the intentional limitations of the software.
Meanwhile, cheap Roborocks had no arbitrary limitations and more honest marketing.
I miss the optimism that this company used to have, but I won't miss the entity that they became.
I suspect that people not really paying for certain things has had an impact. Remember when there were a lot of high-quality paid keyboards for Android?
I doubt those were particularly profitable, but there was a lot of innovation back then.
Why pay for a keyboard app when the default keyboard is already good enough?
Moreover, why risk installing a 3rd-party keyboard app when the App Store is filled with adware and malware? All those handy flashlight and camera apps are a Trojan horse; why should one assume that the various keyboard apps in the App Store aren't keyloggers trying to steal my login info?
In 2025 I can do mostly error-free blind typing on the Pixel 7 keyboard, with all autocorrect and predictive spelling intentionally turned off. Why would I need innovation?
>why should one assume that the various keyboard apps in the App Store aren't keyloggers trying to steal my login info?
Honestly, you shouldn't.
Theoretically, Apple and Google take a % of all payments that go through their stores, with the stated reason being to "monitor and police the safety of the apps on the app store". You really should be able to trust apps on the official app stores, but I don't trust Apple or Google, so the whole system is moot, I guess.
>Moreover, why risk installing a 3rd-party keyboard app when the App Store is filled with adware and malware? All those handy flashlight and camera apps are a Trojan's Horse, why should one assume that the various keyboard apps in the App Store aren't keyloggers trying to steal my login info?
And unless the app gets acquired by the big companies, it will eventually turn into malware.
> Why pay for a keyboard app when the default keyboard is already good enough?
That's probably what people would have said before Swype was invented too. But lots of people now use that style of input in their default keyboards thanks to the people who _did_ pay for keyboards back then.
Who knows what innovations we are missing out on today just because we've consolidated things down to 2-3 suppliers?
Honestly, I think this is a valid viewpoint, but perhaps C is too low-level. The bottleneck in generating code with LLMs tends to be the validation step. Using a language with a lot of hard-to-validate footguns isn't great.
While I am not a big fan of Rust, its philosophy is likely useful here. Something similar, with most of the technical validation pushed onto the compiler, could genuinely help.
Getting rid of the garbage collector with no major increase in human cognitive load might actually be a big win.
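As a toy illustration of that point (my own sketch, not anything from the article): the C version of this, returning a pointer into a local buffer or a NULL that nobody checks, compiles quietly and has to be caught in review or at runtime, while the Rust equivalent simply does not build.

    // Toy example: the dangling-reference version is rejected at compile time.
    //
    // fn first_word_bad<'a>() -> &'a str {
    //     let s = String::from("hello world");
    //     s.split_whitespace().next().unwrap()
    //     // rejected by the borrow checker: the returned reference would outlive `s`
    // }

    // The version that compiles makes ownership explicit, and "no match" is an
    // Option the caller is forced to handle rather than a NULL to forget.
    fn first_word(s: &str) -> Option<&str> {
        s.split_whitespace().next()
    }

    fn main() {
        let line = String::from("hello world");
        match first_word(&line) {
            Some(word) => println!("first word: {word}"),
            None => println!("empty input"),
        }
    }

That whole class of bug never reaching the review step is exactly the kind of thing that matters when an LLM writes the first draft and a human is stuck validating it.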
I feel like this is an artifact of some limitations in the training process for modern LLMs. They rarely get enough training to know when to stop and ask questions.
Instead, they either blindly follow or quietly rebel.
There was a huge overcorrection somewhere around the beginning of 2025, maybe February or so, with ChatGPT. Prior to that point, I had to give a directive in the user config prompt: “don’t tell me something isn’t possible or practical, assume it is within your capabilities and attempt to find a way. I will let you know when to stop”. That was because it was constantly hallucinating that it couldn’t do things, like “I don’t have access to a programming environment” when I wanted it to test code itself before I did. Meanwhile, one tab over, it would spin up a REPL and re-paste some CSV into Python and pandas without being asked.
Frustrating, but “overcorrection” is a pretty bad euphemism for whatever half-assed bit of RLHF lobotomy OpenAI did that, just a few months later, had ChatGPT leaning into a vulnerable kid’s pain and actively discouraging an act that might have saved his life by signaling more warning signs to his parents.
Not long before that happened, after the Python REPL confusion had resolved, I found myself typing to it, even after having had to back out of that user customization prompt, “set a memory that this type of response to a user in the wrong frame of mind is incredibly dangerous”.
Then I had to delete that too, because it would respond with things like “You get it of course, you’re a…” etc.
So I wasn’t surprised over the rest of 2025 as various stories popped up.
It’s still bad. Based on what I see with quantized models and sparse-attention inference methods, even with the most recent GPT-5 releases OpenAI is still doing something to optimize compute requirements that makes the recent improvements very brittle. I of course can’t know for sure, only that its behavior matches what I see when those sorts of boundaries are pushed on open-weight models. My assumption is that the all-you-can-prompt buffet of a Plus subscription is where they’re most likely to deploy those sorts of performance hacks and make the quality tradeoffs; it isn’t their main money source, and it’s not enterprise-level spending.
This technology is amazing, but it’s also dangerous, sometimes in very foreseeable ways, and the more time that passes the more I appreciate some of the public criticisms of OpenAI, e.g., the Amodeis’ split to form Anthropic and the temporary ouster of SA for a few days before that got undone.
The incumbents are trying to fully control the market, but they don't have a justification for that. A company like Google, which already has a monopoly over search, needs to convince the market that this will allow it to expand past search. If the narrative is that anyone can run a specialized model on their own machine for different tasks, that doesn't justify AI companies selling themselves on the assumption of a total market monopoly and a stranglehold over the economy.
They cannot sell themselves without concealing reality. This is not a new thing. There were a lot of suppressed projects in the blockchain industry, where everyone denied the existence of certain projects, most people never heard about them, and people talked as if the best coin in existence doing a measly 4 transactions per second were state of the art... Solutions like the "Lightning Network" don't actually work, but they are pitched as revolutionary... I bet there are more people shilling Bitcoin's Lightning Network than there are people actually using it. This is the power of centralized financial incentives. Everyone ends up operating on top of a shared deception, "the official truth," which may not be true at all.
One argument against local fine-tuning was that by the time you were done training your finetune of model N, model N+1 was out and it outperformed your finetune out of the box. That kinda stopped being the case last year though.