When you say "Gemini", which exact model do you mean? You know there are several, and they vary a lot in capability, right? Pro 3.1 Preview, 2.5 Pro (their latest non-preview Pro model), Flash 3 Preview, ...
Same with GPT-5: the latest 5.5, the prior 5.4, or the original 5.0?
You can't talk about model performance without specifying the exact model.
My apologies, I thought it would be implicit that I was using the top-tier model of the time, given the difficulty of the tasks. GPT-5.5 was too new when I wrote this top comment (although I did test it a bit in a comment below), so I was using GPT-5.4. Gemini is Pro 3.1 Preview.
What do you mean by this? Maybe you are thinking of the old ".NET Framework" runtime, which only runs on Windows? Nowadays there is ".NET Core", which runs on macOS and Linux as well.
Even on Windows, .NET does not work properly with MySQL and Postgres; it only really works properly with Microsoft's own SQL clone (SQL Server, I believe is the official name).
> it is possible with some software to have everything massively cached, with the cloud doing that, with the origin server in my basement, only accessible from the allowed cache arrangement
Whoa. I admire the time and dedication to both models. However, I can't help but LOVE the Minecraft model, since it will live on. Now we just need to 3D print the Minecraft model :D
Thank you. Seems to be down, though; web dev tools indicate unable to open websocket to wss://progressbackend.minefact.de/socket.io/?EIO=4&transport=websocket&sid=...
The article tries to sell it to people who can't run Docker locally (e.g. locked down permissions in enterprise environments, slow old laptop), but hasn't it already been possible to use remote Docker engines?
So the news is that they're offering to host those remotes now, right?
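For anyone who hasn't used remote engines before: the stock Docker CLI has supported this for years via contexts (or the older `DOCKER_HOST` variable). A minimal sketch, assuming SSH access to a hypothetical host `build-server` that already runs `dockerd`:

```shell
# Point the local docker CLI at a remote engine over SSH.
# "remote" and "build-server" are made-up names for illustration.
docker context create remote --docker "host=ssh://me@build-server"
docker context use remote
docker ps   # now lists containers running on build-server, not locally

# One-off equivalent without creating a context:
# DOCKER_HOST=ssh://me@build-server docker ps
```

So the capability itself isn't new; what's new is who hosts (and takes compliance responsibility for) the remote end.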
Nah. It's just that, 15 years later, they're finally trying to find a niche that would also bring them money. There are a lot of businesses that would gladly offload (yes, I did it too) the burden of compliance to a third party, and that's why it's mentioned quite prominently there.
Good for them, but they should have done this ten years ago.
Is this not just about extra credit? So what's included in the subscription doesn't change - just extra credits are now token based instead of message based? (For Plus/Pro)
I think this might also impact how usage is calculated for the subscription plans themselves, not just overages (tokens instead of messages for calculating usage). But the message from OpenAI seems vague.
Fair point. We only have clear evidence they're being more transparent about credit pricing and value, but it's unclear whether that'll make people burn through usage faster or slower.
The fuzziness is intentional. It gives them wiggle room and obscures how much "value" you actually get from $200, a 5-hour block, or a week. That keeps the tension manageable between subscription pricing and pay-per-token API pricing, especially for larger businesses on enterprise plans who want transparent $-per-MTok rates.
If they were fully transparent, like "your $200 sub gets you up to $2,000 of equivalent API usage," it would be a constant fight. People would track pennies and scream any time 5-hour blocks got throttled during peak hours. Businesses would push harder for pay-per-token discounts seeing that juicy $200 sub value.
OpenRouter usage is likely skewed towards LLMs that are more niche and/or self-hostable on solid hardware that exists but that most consumers don't have on hand. I can imagine Anthropic and OpenAI models often get called directly through their own APIs instead.
At least in my experience, and that of friends of mine, we use OpenRouter when we want smaller LLMs like Qwen, but when I use ChatGPT or Claude, I call those APIs directly.
0.1% of OpenRouter is around 400 billion tokens per month or around $400k per month at a cost of $1 per 1 million input tokens, not counting output.
I think it's pretty disingenuous to call your SaaS little when it is projected to spend at least 5 million USD a year just on tokens, and that is a low-end estimate.
Their homepage says 30T tokens monthly, so 0.1% would be 30 billion.
And I pay way less than $1 per million input tokens, especially when caching is taken into account.
EDIT: they updated it in the last day or two, now it says 70T, so I’m a little below 0.1% now. But seriously, the point stands, 70T tokens a month just isn’t that much in the global scheme. The big labs are pushing quadrillions each.
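The thread's estimates are easy to sanity-check. A quick sketch of the arithmetic, assuming a 0.1% share, $1 per 1M input tokens, and ignoring output tokens (all numbers are the thread's, not confirmed figures):

```python
# Back-of-the-envelope check of the token-cost estimates above.

def monthly_cost(total_tokens_per_month, share=0.001, usd_per_mtok=1.0):
    """USD cost of a given share of monthly token traffic."""
    tokens = total_tokens_per_month * share
    return tokens / 1_000_000 * usd_per_mtok

# If OpenRouter served 400T tokens/month, 0.1% is 400B tokens:
print(monthly_cost(400e12))  # prints 400000.0 -> ~$400k/month, ~$4.8M/year

# With the 70T/month homepage figure, 0.1% is 70B tokens:
print(monthly_cost(70e12))   # prints 70000.0 -> ~$70k/month
```

So the "$400k/month" and "30-70 billion tokens" figures in the thread differ only in which total-traffic number you plug in.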
https://huggingface.co/spaces/mteb/leaderboard