When you say "Gemini", which exact model do you mean? You know there are several, and they vary a lot in capability, right? Pro 3.1 Preview, 2.5 Pro (their latest non-preview Pro model), Flash 3 Preview, ...
Same with GPT-5: the latest 5.5, the prior 5.4, or the original 5.0?
You can't talk about model performance without specifying the exact model.
My apologies, I thought it would be implicit that I was using the top-tier model of the time, given the difficulty of the tasks. GPT-5.5 was too new when I wrote this top comment (although I did test it a bit in a comment below), so I was using GPT-5.4. Gemini is Pro 3.1 Preview.
What do you mean by this? Maybe you are thinking of the old ".NET Framework" runtime, which only runs on Windows? Nowadays there is ".NET Core", which runs on macOS and Linux as well.
Even on Windows, .NET does not work properly with MySQL and Postgres; it only really works properly with Microsoft's own SQL clone (SQL Server, I believe is the official name).
> it is possible with some software to have everything massively cached, with the cloud doing that, with the origin server in my basement, only accessible from the allowed cache arrangement
Whoa. I admire the time and dedication to both models. However, I can't help but LOVE the Minecraft model, since it will live on. Now we just need to 3D print the Minecraft model :D
Thank you. Seems to be down, though; web dev tools indicate unable to open websocket to wss://progressbackend.minefact.de/socket.io/?EIO=4&transport=websocket&sid=...
The article tries to sell it to people who can't run Docker locally (e.g. locked down permissions in enterprise environments, slow old laptop), but hasn't it already been possible to use remote Docker engines?
So the news is that they're offering to host those remotes now, right?
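For anyone who hasn't used remote engines before: the stock Docker CLI has supported this for years via contexts (or the older `DOCKER_HOST` variable). A minimal sketch, assuming SSH access to a hypothetical host `build-server` that already runs `dockerd`:

```shell
# Point the local docker CLI at a remote engine over SSH.
# "remote" and "build-server" are made-up names for illustration.
docker context create remote --docker "host=ssh://me@build-server"
docker context use remote
docker ps   # now lists containers running on build-server, not locally

# One-off equivalent without creating a context:
# DOCKER_HOST=ssh://me@build-server docker ps
```

So the capability itself isn't new; what's new is who hosts (and takes compliance responsibility for) the remote end.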
Nah. It's just that, 15 years later, they're finally trying to find a niche that would also bring them money. There are a lot of businesses that would gladly offload (yes, I did it too) the burden of compliance to a third party, and that's why it's mentioned quite prominently there.
Good for them, but they should have done this ten years ago.
Is this not just about extra credit? So what's included in the subscription doesn't change - just extra credits are now token based instead of message based? (For Plus/Pro)
I think this might also impact how usage is calculated for the subscription plans themselves, not just overages (tokens instead of messages for calculating usage). But the message from OpenAI seems vague.
Fair point. We only have clear evidence they're being more transparent about credit pricing and value, but it's unclear whether that'll make people burn through usage faster or slower.
The fuzziness is intentional. It gives them wiggle room and obscures how much "value" you actually get from $200, a 5-hour block, or a week. That keeps the tension manageable between subscription pricing and pay-per-token API pricing, especially for larger businesses on enterprise plans who want transparent $-per-MTok rates.
If they were fully transparent, like "your $200 sub gets you up to $2,000 of equivalent API usage," it would be a constant fight. People would track pennies and scream any time 5-hour blocks got throttled during peak hours. Businesses would push harder for pay-per-token discounts seeing that juicy $200 sub value.
OpenRouter usage is likely skewed towards LLMs that are more niche and/or self-hostable on solid hardware that exists but that most consumers don't have on hand. I can imagine Anthropic and OpenAI models often get called directly through their own APIs instead.
At least in my experience, and that of friends of mine, we use OpenRouter when we want smaller LLMs like Qwen, but when I use ChatGPT or Claude, I call those APIs directly.
0.1% of OpenRouter is around 400 billion tokens per month or around $400k per month at a cost of $1 per 1 million input tokens, not counting output.
I think it's pretty disingenuous to call your SaaS little when it is projected to spend at least 5 million USD a year just on tokens, and that is a low-end estimate.
Their homepage says 30T tokens monthly, so 0.1% would be 30 billion.
And I pay way less than $1 per million input tokens, especially when caching is taken into account.
EDIT: they updated it in the last day or two, now it says 70T, so I’m a little below 0.1% now. But seriously, the point stands, 70T tokens a month just isn’t that much in the global scheme. The big labs are pushing quadrillions each.
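The thread's estimates are easy to sanity-check. A quick sketch of the arithmetic, assuming a 0.1% share, $1 per 1M input tokens, and ignoring output tokens (all numbers are the thread's, not confirmed figures):

```python
# Back-of-the-envelope check of the token-cost estimates above.

def monthly_cost(total_tokens_per_month, share=0.001, usd_per_mtok=1.0):
    """USD cost of a given share of monthly token traffic."""
    tokens = total_tokens_per_month * share
    return tokens / 1_000_000 * usd_per_mtok

# If OpenRouter served 400T tokens/month, 0.1% is 400B tokens:
print(monthly_cost(400e12))  # prints 400000.0 -> ~$400k/month, ~$4.8M/year

# With the 70T/month homepage figure, 0.1% is 70B tokens:
print(monthly_cost(70e12))   # prints 70000.0 -> ~$70k/month
```

So the "$400k/month" and "30-70 billion tokens" figures in the thread differ only in which total-traffic number you plug in.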
https://huggingface.co/spaces/mteb/leaderboard