Blog/analysis

Usage-based billing is the nerf pattern vendors want you to love

2026-06-02·10 min read·by gotnerfed

# Usage-based billing is the nerf pattern vendors want you to love

On June 1, 2026, GitHub Copilot moved from flat-rate pricing to usage-based billing. The framing in the announcement was the kind of thing that gets written by someone who has read a lot of board decks. “Pay for what you use.” “Aligned incentives.” “Most users will pay less.” The gotnerfed index scored it a 47. Billing model shift. Yellow on the dashboard, not red.

I think the index is undercounting this one.

Tier removals like the Claude Code event from April are loud. They generate Reddit threads, migration spikes, public anger. The vendor takes a reputation hit, the users move, the receipt closes. Usage-based shifts are different. They land softer. Most users don’t notice immediately because their bill in month one looks about the same. Then their bill in month four looks different, and by then they’ve reorganized their workflow around the tool and the switching cost is high enough that they pay.

That’s the trick. GitHub is the latest in a sequence.

Five usage-based receipts in eighteen weeks

If you ladder up the receipts from February through June 2026, the gotnerfed index has logged usage-based shifts at Devin (Feb 19), Bolt.new (Apr 2), OpenAI’s Codex product (Apr 2), and GitHub Copilot (Jun 1). DeepSeek’s V3 token price doubling (Feb 8) is technically a price-increase receipt, but it’s the same family of move because DeepSeek’s bills are entirely usage-based. That’s five vendors in eighteen weeks doing some version of the same thing.

Each rolled out differently. The shared structure is harder to miss when you line them up.

Devin shifted what counts as one “Agent Compute Unit.” The dollar price per ACU didn’t change. The number of ACUs consumed by a typical task went up about 35% in the weeks following the change. The company’s own internal docs framed this as “improved measurement accuracy.” Users on the gotnerfed comment thread framed it as a 35% silent price hike.

Bolt.new “backdated” a token rate increase, which is a specific kind of move where the new rate is announced and applied to the current billing period from its start. Users who had already burned a chunk of their budget under the old rate got a recalculated bill at the new rate retroactively. The increase itself was modest, around 16% by the gotnerfed score. The principle of the thing, where a vendor reaches backward and changes the rate on usage you’ve already incurred, is what stood out in the receipt comments.

OpenAI moved Codex from a flat included quota inside ChatGPT subscriptions to a token-based credit system. The framing was “more flexibility.” The functional effect was that the bottom 80% of Codex users got slightly cheaper outcomes and the top 20% got bills that were two to four times higher. The bill distribution is the give-away. Usage-based pricing is almost never neutral in the aggregate; it shifts cost from light users to heavy users and books the difference as either margin expansion or as recovered subsidy. The vendor chooses which.

GitHub did the largest move. Copilot’s flat $10 to $19 plans for individual developers were preserved on paper. Underneath, the included quota was retitled “free usage credits” and the tool would now overage onto a usage-based meter once the credit was burned. The published per-token rate looked reasonable until you ran the math on an active developer’s usage. The gotnerfed receipt has a chart in the comments where someone tracked a week of normal Copilot use and projected the new system would push their bill from $19 to about $63. They’ve since switched to Cursor.

Why these scored lower than they should have

The gotnerfed severity model penalizes irreversibility, lack of warning, breakage of stated promises, and difficulty of workaround. Usage-based shifts score lower than tier removals on most of those axes by design, because the vendor has plausible deniability on every dimension.

Reversibility: technically the user can cap their usage, so the bill is “reversible” in the sense that the user can choose not to use the product. The model treats this as a workaround, even though the workaround is just “use less of the thing you’re paying for.”

Warning given: usage-based shifts almost always come with an announcement, because the vendor needs the announcement to be the legal record. Tier removals can be done by deleting a feature; usage-based shifts have to be on paper. So the warning box is checked.

Stated promises: this is where the scoring gets soft. A flat-rate plan is a promise of predictability. Moving it to usage-based breaks that promise, but it doesn’t break a specific clause in the way a tier removal does. The model docks points but not as many as it should.

Workaround difficulty: the workaround for a usage-based shift is to set a budget cap or switch vendors. Both are technically straightforward, so the model scores it lower. What the model misses is that setting a budget cap means deliberately throttling your own productivity, which is a workaround that costs the user something the model can’t easily measure.

If you re-ran the GitHub Copilot receipt with a corrected weighting for “predictability promise broken,” it would probably score in the high 60s rather than 47. Worth keeping in mind when you scan the dashboard. Color is a useful signal, not a complete one.

The behavioral tax

There’s a second-order effect that doesn’t show up in any pricing page or any receipt. When a tool moves from flat-rate to usage-based, people stop using it the same way.

This sounds obvious but the effects are bigger than people think. A flat-rate Copilot user runs the tool constantly. They let it suggest completions for every line, they accept the ones that fit, they ignore the ones that don’t, they treat it as ambient. A usage-based Copilot user starts asking themselves “is this completion worth two cents” before each interaction. That cognitive overhead changes how the tool feels. The marginal cost is low in dollars but high in friction.

There’s a research line on this, mostly out of behavioral economics circles, that calls it the “meter effect.” Flat-rate utilities like electricity, water, mobile data, and home internet get consumed more aggressively than metered ones, even when the per-unit price is identical. The presence of the meter changes behavior independently of the price.

This is what vendors get from usage-based pricing that doesn’t show up in the announcement. Margin recovery is one piece. Reduced load is the other. Heavy users self-limit when they can see the meter spinning, which means the vendor’s infrastructure bill drops at the same time the vendor’s revenue per user climbs. That’s a structurally attractive trade for the vendor and a structurally bad one for the user, who is now paying more and getting less use out of the tool.

Cursor figured this out a while ago, which is why their pricing has oscillated between flat-rate, capped flat-rate, “fast vs slow request” tiering, and credit-based usage windows over two years. They’re not undecided. They’re testing which framing extracts the most revenue while minimizing the meter effect.

What the contract lets the vendor do

Usage-based pricing is paired with a specific clause structure in vendor terms of service that’s worth reading once carefully.

Almost every usage-based AI vendor reserves the right to adjust per-unit pricing with notice. The notice period varies. Anthropic’s API terms specify 30 days. OpenAI’s specify “reasonable notice,” which has been as short as 14 days in practice. DeepSeek’s terms just say “from time to time.” GitHub Copilot’s new usage-based terms specify 30 days, but with a clause that allows shorter notice for “operational reasons.”

The structural implication is that any usage-based bill you’re paying today is a snapshot of the vendor’s current pricing, not a commitment. The vendor can raise per-token rates next quarter, give 14 to 30 days notice, and continue billing you under your existing contract. You can cancel, sure. But if your workflow is wired into the tool, cancellation has a cost that the vendor is implicitly betting you’ll absorb.

This isn’t hypothetical. DeepSeek doubled its V3 output token price in February. Users who had built products on top of DeepSeek’s API got 30 days to adjust their cost models. Some swapped to Qwen or other cheaper open-weight models. Some passed the cost through to their customers. Some ate the margin hit. The point is the vendor had the option, and the option was in the contract from day one.

If you’ve moved any meaningful workload to a usage-based AI tool, your single most important defensive action is to read the price-change clause in the TOS. Don’t skim it. Read it slowly. Note the notice period. Note whether the clause references a published rate card or whether the vendor reserves discretion. Note whether there’s a price-lock option (some enterprise contracts have one; almost no consumer contracts do).

What “usage-based” hides

There’s a second tier of detail in usage-based pricing that vendors are not contractually required to surface, and almost none of them voluntarily do.

What counts as a “token” can shift. OpenAI changed its tokenizer between GPT-4 and GPT-4o, and a string of text that used to count as 100 tokens now counts as 87 in some cases and 112 in others. The price per token didn’t change. The cost of the same operation did.

What counts as one “request” in tools that bill per request can shift. Cursor has redefined what triggers a “fast request” three times since launch, each time in a direction that increases the number of fast requests consumed by a given session. The user sees the same workflow. The meter spins faster.

What counts as one “credit” in Devin-style ACU systems can shift, as the Devin February receipt documented. The model’s internal accounting for ACU consumption is opaque, and the vendor has full discretion to adjust it. They have adjusted it.

The pattern across all of these is that the unit of measurement is itself adjustable, and the adjustments tend to favor the vendor’s revenue. You won’t see this in the pricing page. You’ll see it in your bill, six weeks after the change, when your usage of the tool didn’t increase but your spend did.

What to do if you’re on a usage-based plan

A few things that work, ordered roughly by how much friction they cost you.

Set a hard budget cap. Every usage-based AI tool offers one. Most users never set one because they assume their normal usage will stay normal. The point of the cap isn’t to limit your normal usage; it’s to catch the case where pricing changes underneath you. If your budget cap is 1.5x your typical monthly spend, you’ll notice a pricing change in the same billing cycle it happens, instead of three cycles later.

Audit your bill weekly, not monthly. The first 7 to 10 days of a billing cycle are when pricing changes show their effect most clearly, because your usage pattern is comparable to the same window in the prior cycle. A 15% week-over-week unit cost increase at the same usage volume is the vendor moving, not your behavior changing. Worth catching early.

Run parallel against a cheaper or open-weight model for at least 5% of your workload. This sounds like a waste of time and it isn’t. The value is partly cost benchmarking, which keeps you honest about whether your current vendor is competitive. The bigger value is that you maintain operational familiarity with an alternative, so if the vendor does a Devin-style silent ACU redefinition, you can switch without a week of re-tooling.

Don’t trust the per-unit rate as a stable number. Treat the published per-token, per-ACU, or per-request price as the price for this quarter. Build your cost models with a 20% buffer on top. Build your fallback plans assuming the buffer will be needed within 12 months.

Don’t sign annual deals on a usage-based tool unless you have a price-lock clause in writing. The pitch for an annual deal on a usage-based plan is almost always a discount on the platform fee, with usage-based rates unlocked. That trade favors the vendor, not you.

The thing the index hasn’t priced in

If the rate of usage-based conversions continues at its current pace, the dominant AI tool pricing model by the end of 2026 will be metered, not flat. That’s not a forecast based on a vibe. It’s based on five receipts in eighteen weeks at five different vendors, each picking up where the last left off.

What this means structurally is that the predictability premium, the thing customers used to get from paying a flat $20 or $30 a month, is going away. The premium is being recaptured by vendors as variable revenue. The gotnerfed receipts can document each conversion, but the cumulative effect is hard to score because no individual receipt looks catastrophic. It’s the aggregate that matters, and the aggregate is moving in one direction.

If you’ve been keeping an eye on tier removals as the main pattern, you’ve been watching the loud thing. The quieter thing has been the usage-based wave, and it’s the one that’s reshaping the cost of running an AI-dependent product. The next 12 months of receipts will tell you whether 47 was the right score for the GitHub Copilot move, or whether it should have been higher. My bet is higher.