AI Firms Tighten Consumer Access Amid Compute Crunch
Theo - t3.gg · go watch the original →
the gist
AI labs like Anthropic and Microsoft are restricting consumer-tier usage not for revenue but to preserve scarce compute for high-value enterprise customers, ending heavy subsidies that powered dev tools like Claude and Copilot.
Compute Scarcity Drives Pricing Shifts, Not Greed
Theo agrees with Prime that cracks are appearing in the AI economy, but clarifies they're rooted in compute constraints rather than pure profit motives. Early signs emerged last July when Cursor abandoned message-based pricing (a fixed number of requests regardless of cost) because heavy users burned thousands of dollars in inference on light plans while others used pennies. "Some users would do $10 of inference with their 200 messages. Others would do thousands of dollars of inference with their 200 messages," Theo notes, explaining Cursor's pivot to compute-reflective pricing: unlike the labs, Cursor owns no GPUs it can use to subsidize losses. GitHub Copilot clung to the flawed model longer, offering 1,500 messages for $40/month, yet a single crypto challenge ate two hours and roughly $100 of compute. Per-model multipliers (GPT-4.5 at 1x vs. 4o-mini at 7.5x in Theo's telling) exposed the mismatch: pricing by message count ignores runtime and model expense, akin to "Walmart charging by cart items, where some grab candy and others five TVs."
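The flat-plan failure mode can be sketched in a few lines. This is a hypothetical illustration using the episode's round numbers ($40 plan, 200 messages, $10 vs. thousands of dollars of inference); the per-message costs are assumptions, not Cursor's or Copilot's actual figures.

```python
# Why flat per-message pricing breaks: revenue per user is fixed,
# but inference cost per message varies by orders of magnitude.

PLAN_PRICE = 40.0    # $/month, flat tier (Copilot-style example)
MESSAGE_CAP = 200    # messages included in the plan

def margin(cost_per_message: float, messages_used: int = MESSAGE_CAP) -> float:
    """Provider margin for one user on the flat plan (assumed numbers)."""
    return PLAN_PRICE - cost_per_message * messages_used

light_user = margin(0.05)   # $10 total inference -> provider keeps $30
heavy_user = margin(15.0)   # $3,000 total inference -> provider loses $2,960

print(f"light user margin: ${light_user:+,.2f}")
print(f"heavy user margin: ${heavy_user:+,.2f}")
```

Compute-reflective pricing collapses this spread by charging each user roughly what their inference actually cost.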
Anthropic's moves exemplify compute rationing. They tested doubling usage allowances during off-peak hours to shift power users away from workday peaks, when the fixed GPU fleet saturates (think 95 of 100 GPUs busy during the day versus plenty free overnight). When that failed to move demand, they accelerated 5-hour session limits during the 5-11am PT peak. The recent "painted door test" that hid Claude Code from $20 tiers was a bid to claw compute back from low-end users and prioritize enterprise growth, their real revenue driver. Consumer subs can be subsidized up to 25x ($5k of inference on a $200 plan), but as Cursor's audits show, that trades GPUs for marketing visibility.
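The fixed-fleet logic behind off-peak incentives can be made concrete. A toy model, using the episode's assumed numbers (100 GPUs, 95 busy at peak); the off-peak demand figure is an illustrative assumption:

```python
# Toy model of a fixed GPU fleet: incentives only help if they
# actually move demand out of the saturated window.

FLEET = 100  # total GPUs available, fixed in the short term

def headroom(busy: int) -> int:
    """GPUs free to absorb new requests in a given time window."""
    return FLEET - busy

# Assumed demand profile: near-saturation at peak, slack off-hours
demand = {"peak (5-11am PT)": 95, "off-peak": 60}

for window, busy in demand.items():
    print(f"{window}: {headroom(busy)} GPUs free")
```

With only 5 GPUs of peak headroom, every power user shifted off-peak frees proportionally far more capacity than any price change could.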
Subsidies Fuel Dev Adoption but Mask True Costs
Labs aggressively subsidize consumer tiers as a loss leader for enterprise deals, but electricity alone, at 15-20% of API spend, turns heavy users into net losses. Theo's RTX 5090 spiked his San Francisco power bill by $1,000, to $1,800/month; scaled up, running high-end hardware 24/7 costs $500+/month even in cheap US locales. Beyond power, amortization looms: pre-training bakes data into parameters at billion-dollar scale, while post-training (RLHF/RLVR) refines behavior far more cheaply, excelling at code, agents, and tool use.
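The power math is easy to check yourself. A back-of-envelope sketch: the rig wattage and electricity rates below are assumptions (roughly SF-tier vs. cheap-power US rates), not figures from the episode.

```python
# Monthly electricity cost for a GPU rig running 24/7.
# watts and cents_per_kwh are illustrative assumptions.

def monthly_power_cost(watts: float, cents_per_kwh: float,
                       hours: float = 24 * 30) -> float:
    """Cost in dollars: (kW) * (hours/month) * ($/kWh)."""
    kwh = watts / 1000 * hours
    return kwh * cents_per_kwh / 100

# ~1.5 kW rig: SF-like rate (~$0.45/kWh) vs. cheap locale (~$0.10/kWh)
print(f"SF-like rate: ${monthly_power_cost(1500, 45):,.0f}/month")
print(f"cheap locale: ${monthly_power_cost(1500, 10):,.0f}/month")
```

Even one always-on consumer rig lands in the hundreds of dollars a month at coastal rates, which is why electricity alone can erase margin on heavy subscription users.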
Opus 4.5 likely reflects a fresh pre-training run (3x cheaper to serve, and smarter), with 4.6/4.7 to follow as post-trained tunes. GPT-4o-mini to 4o felt iterative; 4.5 was revolutionary enough to warrant GPT-6-style naming. Exceptions like Grok 4's RL-heavy spend or Cursor's Composer 2 (reportedly ~4x the post-training compute atop Kimi's pre-trained base) prove post-training's punch, but labs typically ship 5-6 post-trained releases per pre-training run, keeping iteration affordable.
Dario Amodei frames it starkly: treat each model as a "separate company." A model trained for $100M in 2023 that earns $200M in revenue is profitable on its own, even as the firm posts losses from scaling the next generation (a $1B training run chasing $2B in revenue). "If you look at each model... the model that was trained in 2023 was profitable... even if you add inference costs... you're kind of in a good state." This per-model lens shows the business is viable as long as revenue keeps scaling with each compute bet.
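The per-model lens is just two P&Ls side by side. A sketch with the episode's round numbers ($100M train, $200M revenue, $1B next-gen run); the inference-cost figure is an assumption added for illustration:

```python
# Dario's "each model is a separate company" accounting lens.
from dataclasses import dataclass

@dataclass
class ModelPnl:
    train_cost: float
    revenue: float
    inference_cost: float = 0.0  # serving cost attributed to this model

    def profit(self) -> float:
        return self.revenue - self.train_cost - self.inference_cost

# 2023 model: profitable on its own (inference cost is an assumed figure)
model_2023 = ModelPnl(train_cost=100e6, revenue=200e6, inference_cost=40e6)
# Next-gen run: still training, pure spend so far
next_gen = ModelPnl(train_cost=1e9, revenue=0.0)

print(f"2023 model P&L: ${model_2023.profit()/1e6:+,.0f}M")
print(f"firm-wide P&L:  ${(model_2023.profit() + next_gen.profit())/1e6:+,.0f}M")
```

The firm-wide number is deep in the red while the 2023 model alone is in the black, which is exactly the distinction Amodei is drawing.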
Enterprise Priority Exposes Consumer Limits
Microsoft's Copilot shift (pausing signups and ditching message counts for compute-based pricing) signals a capacity crisis. Microsoft doesn't need revenue from $40/month users; it needs Azure GPUs for enterprise. Copilot's historically light load allowed the subsidies, but growth strained GPU reservations. "You pause signups because you don't have capacity," Theo stresses. Labs don't chase $20-to-$100 upsells; subscriptions are marketing to enterprises, and the compute hoarding is now biting.
Prime is right about the shaky foundations (Anthropic's test, the Copilot pause), but he misses the history (Cursor pioneered the fixes) and the nuance: these aren't revenue squeezes but fights over GPUs. OpenAI's $120B runway burns down at $5-7B/month; the pricing changes test viability amid scaling laws that demand perpetual reinvestment.
"The problem isn't that there isn't enough money... All they care about is their enterprise customers that they make actual money from."
Key Takeaways
- AI pricing pivots stem from compute bottlenecks for enterprise, not consumer nickel-and-diming—prioritize off-peak or higher tiers to sustain access.
- Message-based pricing fails spectacularly for variable-cost inference; expect compute-proportional pricing from tools like Copilot.
- Consumer subs are 20-25x subsidized marketing plays—extract max value now before full clawback.
- Electricity is 15-20% of API costs; heavy local GPU runs demand power audits (e.g., $500-1k/month per rig).
- Per-model profitability (pre-train amortized over revenue) holds if scaling laws deliver; post-training iterations are cheaper bets for capability jumps.
- Cursor's early switch proves non-labs can't subsidize—audit $/compute ratios across providers.
- Peak-hour limits and tier gates ration fixed GPUs; enterprise trumps all.
- Watch for more pauses/sign-up halts as signals of capacity walls over cashflow woes.
- Post-training (RLHF) drives agent/code gains efficiently—fine-tunes like Composer 2 outperform naive pre-trains.
- Labs release polished internals post-pretrain; rapid drops aren't always billion-dollar overhauls.