
Flat-rate AI subscriptions were always a bet on average usage. Providers priced for the median user, absorbed costs at the margins, and hoped the math held. For a while, it did. But the arrival of agentic AI workflows, capable of running near-continuously and consuming tokens at scale, has changed the calculus for every major platform selling AI access for a flat monthly fee.
GitHub, the Microsoft-owned developer platform, is now making that reckoning explicit. On April 28, the company announced that GitHub Copilot will shift to a usage-based billing model starting June 1. The stated rationale is direct: the current system, which lumps a quick chat query and a multi-hour autonomous coding session into the same billing category, is no longer financially sustainable.
Under the existing structure, Copilot subscribers receive monthly allocations of requests and premium requests, which are consumed whenever users prompt the AI. The problem GitHub identifies is that those categories obscure wildly different backend costs. A code completion suggestion and a deep reasoning task powered by a high-end GPT model carry completely different infrastructure price tags, yet both draw from the same user bucket.
The new model introduces AI Credits, with each subscriber receiving a monthly allotment matching their subscription payment. Usage beyond that allotment will be calculated based on token consumption, covering input, output, and cached tokens, billed at the listed API rates for whichever model the user is calling. Those rates vary significantly: OpenAI's GPT models currently range from $4.50 per million output tokens for GPT-5.4 Mini to $30 per million output tokens for GPT-5.5, according to figures cited in the source reporting.
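The arithmetic behind token-based overage billing is straightforward. A minimal sketch follows, using the two output-token rates cited above; the input and cached-input rates and all token counts are hypothetical assumptions added for illustration, not figures from GitHub or OpenAI:

```python
# Illustrative overage estimate. Only the output-token rates ($4.50 and $30
# per million) come from the article; input and cached rates are placeholders.
RATES_PER_MILLION = {
    # model: (input, cached_input, output) in USD per 1M tokens
    "gpt-5.4-mini": (0.90, 0.45, 4.50),
    "gpt-5.5": (6.00, 3.00, 30.00),
}

def overage_cost(model, input_tokens, cached_tokens, output_tokens):
    """USD cost for token usage beyond the monthly AI Credits allotment."""
    in_rate, cache_rate, out_rate = RATES_PER_MILLION[model]
    return (input_tokens * in_rate
            + cached_tokens * cache_rate
            + output_tokens * out_rate) / 1_000_000

# A long agentic session can consume millions of tokens in one run:
cost = overage_cost("gpt-5.5", input_tokens=3_000_000,
                    cached_tokens=1_000_000, output_tokens=500_000)
# 3M * $6 + 1M * $3 + 0.5M * $30, all per million = $36.00
```

The point the sketch makes concrete: at the top-tier rate, a single sustained agentic run can burn through a meaningful fraction of a monthly allotment, while the same session on the cheapest model would cost a small fraction of that.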
The Agentic Cost Problem
The timing of this change is not incidental. AI critic Ed Zitron reported last week, citing leaked internal documents, that GitHub Copilot's weekly operating costs had nearly doubled since January. That period aligns with the rapid adoption of agentic AI assistants running multi-agent workflows, tools that keep models active far longer than a single prompt-response exchange. Flat subscriptions were never designed for that usage pattern.
What Still Comes Free
Not every Copilot feature will draw from the AI Credits pool. Simple suggestions like code completion and Next Edit remain outside the credit system. Copilot code reviews, however, will carry an additional cost denominated in GitHub Actions minutes, adding another variable to the total bill for teams that rely on automated review at scale.
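For teams budgeting under the new structure, the review cost becomes a second line item layered on top of any credit overage. A minimal sketch of that total, where the Actions per-minute rate, the credit allotment, and all usage figures are hypothetical assumptions for illustration:

```python
# Hypothetical Actions per-minute rate (USD) for a standard hosted runner.
ACTIONS_RATE_PER_MIN = 0.008

def monthly_bill(credit_allotment_usd, token_usage_usd, review_minutes):
    """Overage beyond the dollar-denominated credit allotment, plus the
    Actions-minute cost of automated code reviews."""
    overage = max(0.0, token_usage_usd - credit_allotment_usd)
    return overage + review_minutes * ACTIONS_RATE_PER_MIN

# Illustrative month: $39 allotment, $75 of token usage, 500 review minutes.
bill = monthly_bill(39.0, 75.0, 500)
# $36.00 overage + $4.00 of Actions minutes = $40.00
```

Even at a small per-minute rate, review minutes scale with every pull request, which is why teams relying on automated review at scale will want to track that line item separately from token spend.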
A Preview Tool Before the Switch
GitHub is giving users a runway before June 1. A preview bill tool will let Copilot subscribers model how their current usage patterns would translate into charges under the new system. That is a useful concession for teams managing tight tooling budgets, but it also signals that GitHub expects some sticker shock once real numbers surface.
Platform-Wide Signals of Stress
GitHub made a parallel move last week by pausing new signups for its subscription plans, tightening usage limits, and removing Anthropic's Claude Opus models from lower-tier Pro plans. The company described those changes as necessary to ensure a predictable experience for existing customers. Taken together, the actions sketch a platform under genuine infrastructure pressure, not one running a pricing experiment from a position of comfort.
Anthropic Is Doing the Same
GitHub is not acting in isolation. Anthropic has begun charging large Claude Enterprise subscribers for the full cost of computing resources rather than offering subscription-subsidized discounts. Anthropic also briefly tested removing Claude Code from its $20-per-month Pro plan and has been adjusting usage limits during peak hours between 5 am and 11 am Pacific Time to control costs and improve reliability. The pattern across providers points to a structural shift, not isolated policy tweaks.
GitHub framed the June 1 change as a move that reduces the need to gate heavy users, arguing that aligning pricing to actual usage produces a more reliable product for everyone. Whether subscribers accept that framing depends largely on where their usage falls relative to the new credit thresholds. Teams running lightweight Copilot sessions may see little change. Those running agentic pipelines will face a materially different cost structure.
If inference costs continue rising and compute shortages persist, this model could become the default across the entire AI tooling market. Agencies and studios that have built workflows around flat-fee AI access may need to audit their actual token consumption before those bills arrive, because the era of subsidized heavy usage appears to be closing faster than most anticipated.