These subscriptions have limits... so how could someone use $200 worth of compute on a $20/month plan? Isn't that exactly what the limits on the $20 plan are there to prevent? And couldn't a Claude Code user use that same $200 worth on $20/month? (And how do I do this?)
The limits on the Max subscriptions are more generous, and power users are generating losses.
I'm rather certain, though I cannot prove it, that buying the same tokens would cost at least 10x more through the API. Anecdotally, my Cursor team usage was getting to around $700/month. After switching to Claude Code Max, I have so far only once hit the 3-hour limit window on the $100 sub.
What I'm thinking is that Anthropic is making a loss on users who use it a lot, but there are also a lot of users who pay for Max and don't actually use it.
With the recent improvements and increase in popularity of projects like OpenClaw, the number of users generating losses has probably massively increased.
I've spent $17.64 on on-demand usage in Cursor, with an estimated API cost of $350, mostly using Claude Opus 4.5. Some of this is skewed since subagents use a cheaper model, but even accounting for subagents, what I paid is 10x below the public API prices. Either the enterprise on-demand usage gets subsidized, API prices carry a 10x markup, or Cursor is only billing a ~10% surcharge to cover their costs of indexing and such.
edit: My $40/month subscription used $662 worth of API credits.
Oh, I figured out the costs for the enterprise plan: it's $0.04 per request; I'm not charged per token at all. Billing is completely different for enterprise users than for regular users.
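For comparison, here is a back-of-the-envelope sketch of how flat per-request billing can diverge from metered per-token billing. Only the $0.04/request figure comes from the comment above; the per-token rates, the request count, and the token counts per request are all illustrative assumptions, not published pricing.

```python
# Flat per-request vs. metered per-token billing, back of the envelope.
# Only the $0.04/request figure is from the thread; everything else is assumed.

PER_REQUEST_RATE = 0.04  # enterprise plan: flat fee per request

# Hypothetical per-token rates, in dollars per million tokens (assumptions)
INPUT_RATE_PER_MTOK = 15.00
OUTPUT_RATE_PER_MTOK = 75.00

def per_request_cost(n_requests: int) -> float:
    """Cost under flat per-request billing."""
    return n_requests * PER_REQUEST_RATE

def per_token_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost under metered per-token billing."""
    return (input_tokens / 1e6) * INPUT_RATE_PER_MTOK \
         + (output_tokens / 1e6) * OUTPUT_RATE_PER_MTOK

# Assume ~441 requests (roughly $17.64 / $0.04), each a heavy agentic call
# with 50k input tokens and 2k output tokens (made-up averages).
n = 441
flat = per_request_cost(n)
metered = per_token_cost(n * 50_000, n * 2_000)
print(f"flat per-request billing: ${flat:.2f}")
print(f"metered per-token billing: ${metered:.2f} ({metered / flat:.0f}x)")
```

Under these assumed token counts, the metered total comes out an order of magnitude above the flat fee, which is at least consistent with the 10x-plus gap described above.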
This exactly. I think this is why Anthropic simply don’t want 3rd party businesses to max out the subscription plans by sharing them across their own clients.
I'd agree with this. I ended up picking up a Claude Pro sub and am less than impressed with the volume allowance. I generally get about a dozen queries (including simple follow-ups/refinements/corrections) across a relatively small codebase, with prompts structured to minimize the parts of the code touched, and moving onto fresh contexts fairly rapidly, before getting cut off by their ~5-hour window. Doing that ~twice a day gets me cut off by the weekly limit with a day or two still left on it.
I don't entirely mind, and am just considering it an even better work:life balance, but if this is $200 worth of queries, then all I can say is LOL.
Bumping into those limits is trivial, those 5 hour windows are anxiety inducing, and I guess the idea is to have a credit card on tap to pay for overages but…
I’m messing around on document production, I can’t imagine being on a crunch facing a deadline or dealing with a production issue and 1) seeing some random fuck-up eat my budget with no take backs (‘sure thing, I’ll make a custom docx editor to open that…’), 2) having to explain to my boss why Thursday cost $500 more than expected because of some library mismatch, or 3) trying to decide whether we’re gonna spend or wait while stressing some major issue (the LLM got us in it, so we kinda need the LLM to get us out).
That’s a lot of extra shizz on top of already tricky situations.
The usage limit on your $20/month subscription is not $20 of API tokens (if it were, why subscribe?). It's much, much higher, and you can hit the equivalent of $20 of API usage in a few days.
Their bet is that most people will not fill up 100% of their weekly usage for 4 consecutive weeks of their monthly plan, because they are humans and the limits impede long running tasks during working hours.
I do believe it's unreasonable. The limits are the limits; once you reach them, there's no more free lunch.
Fix the limits so they're reached at a rate that sustains their business? Obviously this WILL happen eventually when they need to pay for things.
"A rate that sustains their business" at the moment probably looks like API pricing or maybe even higher. That means subscriptions get significantly more expensive and/or limited, which is maybe where things are headed.
The median subscriber generates about 50% gross margin, but some subscribers use 10x the inference compute of other subscribers (by using it more...), and the distribution is positively skewed.
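The shape of that claim can be sketched with a toy model: a lognormal usage distribution where the median subscriber costs half the subscription price, but a long right tail of heavy users pulls the mean cost well above the median. All numbers here are invented for illustration, not Anthropic's actual figures.

```python
# Toy model: positively skewed usage costs across subscribers.
# Every number here is an assumption chosen to match the comment's shape:
# median cost ~$50 against a hypothetical $100/month price (~50% gross margin),
# with a heavy right tail of power users.
import math
import random

random.seed(0)
PRICE = 100.0  # hypothetical monthly subscription price

# Lognormal with median exp(MU) = $50; SIGMA controls tail heaviness.
MU, SIGMA = math.log(50), 1.3
costs = [random.lognormvariate(MU, SIGMA) for _ in range(10_000)]

costs.sort()
median_cost = costs[len(costs) // 2]
mean_cost = sum(costs) / len(costs)

print(f"median subscriber cost: ${median_cost:.2f} "
      f"(gross margin {100 * (1 - median_cost / PRICE):.0f}%)")
print(f"mean subscriber cost:   ${mean_cost:.2f} (vs. ${PRICE:.0f} price)")
print(f"top 1% of subscribers account for "
      f"{100 * sum(costs[-100:]) / sum(costs):.0f}% of total cost")
```

With these assumed parameters, the median subscriber is comfortably profitable while the mean cost per subscriber can exceed the subscription price, i.e. the typical user subsidizing story and the aggregate-loss story can both be true at once.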