FWIW this is what Linux and the early open-source databases (e.g. PostgreSQL and MySQL) did.
They usually lagged for large sets of users: Linux was not as advanced as Solaris, PostgreSQL lacked important features contained in Oracle. The practical effect of this is that it puts the proprietary implementation on a treadmill of improvement where there are two likely outcomes: 1) the rate of improvement slows enough to let the OSS catch up or 2) improvement continues, but smaller subsets of people need the further improvements so the OSS becomes "good enough." (This is similar to how most people now do not pay attention to CPU speeds because they got "fast enough" for most people well over a decade ago.)
DeepSeek 3.2 scores gold at the IMO and other competitions. Google had to use parallel reasoning to do that with Gemini, and the public version still only achieves silver.
I wasn't judging, I was asking how it works. Why would OpenAI/Anthropic/Google let a competitor scrape their results in sufficient amounts that it lets them train their own thing?
I think the point is that they can't really stop it. Let's say that I purchase API credits and resell them to DeepSeek.
That's going to be pretty hard for OpenAI to figure out, and even if they figure it out and stop me, there will be thousands of other companies willing to do that arbitrage. (Just for the record, I'm not doing this, but I'm sure people are.)
They would need to be very restrictive about who is and isn't allowed to use the API, and that would kill their growth because customers would then just go to Google or another provider that is less restrictive.
This is great on paper until some jackass wants to access their home NAS over the public frequency range so they can watch anime all day at their desk, which only works when they use multiple channels at once.
There are tons of cool things society could enjoy if it wasn't for a small handful of shameless actors.
People don't need a calculator website anymore. They can just prompt their own AI account to generate whatever calculator they need in the moment. I already have a few pinned in my favorites that I use often.
That is the real promise of AI driven software. Bespoke tiny apps available to anyone whenever they simply just ask for it.
For the foreseeable future, until maybe we have systems that can predict what someone will need/want for an app at any given time (a prospect as horrifying as it is awesome imo), there’ll be plenty of people, maybe even a majority, who don’t know what they want or need until it’s shown to them.
There will be many more niche applications vibe-coded by people with lots of knowledge and no coding experience/desire that people will use rather than thinking of an app themselves to create.
Then there will be people like you, me, OP and 99% of the other HN community that have a million ideas they want to create, use, and sometimes share.
There are a lot of things I don’t know about and even more I don’t know I don’t know about and in those cases, there’s still a wide open door for people to create applications and experiences that share their knowledge/vision.
I could ask Claude Code or some other future platform to build me a financial calculator every time I need it, but why would I do that when someone with the benefit of prior knowledge and experience has already done that for me?
They probably included calculators I didn’t even know I needed.
Using an ad blocker just shifts the cost of creating/providing content onto people not using ad blockers.
The enshittification of the internet is largely driven by people ad blocking, as it incentivizes more click bait, more ads, and sloppier cheap content.
For engineering/software related content, the impact is immense since the audience is largely people ad blocking. I won't name names, because they fear backlash from their "ad block is awesome" audience, but some well-known YouTubers in the hard nerdy tech space report that they receive no compensation for 40-50% of their views.
So you can evangelize how great it is to not have to compensate for content, but don't think it's some kind of everyone wins victory. It's just a cost shift onto someone else, which largely manifests as bad content being needed to cover costs.
The correct approach is paying for what you use, and avoiding ad-supported content to send the message that you want a paid option.
> The enshittification of the internet is largely driven by people ad blocking
This is unfairly putting the blame on only one rational actor in a prisoner's dilemma.
Content providers are free to put their content behind a paywall with no ads, but they choose not to.
They choose not to because people don't pay for content when they can get it from other providers who don't use a paywall.
Consumers then are left without the option to pay for an ad-free experience.
But ads are run on hardware the consumer owns, consuming their resources and harvesting personal information on the consumer, which is a security concern.
So even if they want to support content creators by viewing the ads they run, they need to also accept the security trade-off, which many reasonably do not.
Just a note that many creators integrate the advertisement of their sponsors into their content presentation and not all of that can be stopped by ad blockers (most, in fact, can't). Those creators also tend to have a Patreon or similar so their content can be supported directly. And I'd wager that's a much better model for the creators than relying on ads they don't even select themselves and that possibly clash with their content. This also makes it much more clear to users that the creators directly benefit from those integrated ads, so those kinds of ads are probably much better tolerated.
40-50% of people are ad-blocking some rather beloved content creators. That means not paying for Premium and not viewing ads.
OK, so maybe they are subscribing to Patreon? Maybe Nebula?
Well those two have conversion rates around (on a good day) 1%.
You can swim in the waters of cognitive dissonance because ads really do suck and ad block is a great way to stop the pain while still getting what you want.
Understand though, the statistics are so damning against the ad-block crowd that you come off like the people screeching about human-generated CO2 being totally fine for the environment (It helps plants grow!) because they cannot imagine having to give up commuting in their diesel monster pick-up truck every day. (Ad block does no damage because I cannot imagine having to see ads...)
As an aside, ironically, security nightmare ads are really only served to people with tracking blockers, because those people are the lowest value visitors and only scammers/bottom feeders really bid on their views. Regular tech illiterate people get ads for Tide and Toyota. The more you know.
>> The enshittification of the internet is largely driven by people ad blocking, as it incentivizes more click bait, more ads, and sloppier cheap content.
In Bizarro World. In our world, enshittification of the internet is driven predominantly by ads. For example, click bait, more ads and sloppier cheap content are all motivated by the need to create ever more content in order to serve ever more ads.
In the same way, spam blockers don't cause more spam, vaccines don't cause more disease, eating fish doesn't cause more fires, etc.
The internet is shitty in many ways and ads are one reason. You can pay for ad-free streaming but still get low bitrate although you paid enough to cover traffic costs for higher bitrate. You can pay to have ad-free Instagram but still see all this shitty AI-generated crap and bot posts. You can pay for YouTube Premium but Google will still massively invade your privacy.
Do you really think that if everybody turned off their ad blockers and paid for premium services, the internet would become better? The way I see it, corporate greed would milk consumers even more.
Instead of surrendering to ads, we should promote directly donating to (or supporting) YouTubers or websites that provide value to us.
Perennial doesn't make sense in the context of something that has been around for a few months. Observations from the spring 2025 crop of LLMs are already irrelevant.
>One wonders if some professional mathematicians are instead choosing to publish LLM proofs without attribution for career purposes.
This will just become the norm as these models improve, if it isn't largely already the case.
It's like sports where everyone is trying to use steroids, because the only way to keep up is to use steroids. Except there aren't any AI-detectors and it's not breaking any rules (except perhaps some kind of self moral code) to use AI.
People seem to have this belief, or perhaps just general intuition, that LLMs are a Google search on a training set with a fancy language engine on the front end. That's not what they are. The models (almost) avoid copyright issues by themselves, because they never copy anything in the first place, which is why the model is a dense web of weight connections rather than an orderly bookshelf of copied training data.
Picture yourself contorting your hands under a spotlight to generate a shadow in the shape of a bird. The bird is not in your fingers, despite the shadow of the bird, and the shadow of your hand, looking very similar. Furthermore, your hand-shadow has no idea what a bird is.
For a task like this, I expect the tool to use web searches and sift through the results, similar to what a human would do. Based on progress indicators shown during the process, this is what happens. It's not an offline synthesis purely from training data, something you would get from running a model locally. (At least if we can believe the progress indicators, but who knows.)
While true in general, they do know many things verbatim. For instance, GPT-4 can reproduce the Navy SEAL copypasta word for word with all the misspellings.
I'd almost agree if the volume on $SPY zero day options wasn't so immense.