Anthropic and OpenAI just found product-market fit — the $2,180/month bill Simon Willison made public, and what builders should do about it

Last updated: May 28, 2026. Read time: 8 minutes. What you’ll learn: Simon Willison’s HN-trending PMF post lays out the case that Anthropic and OpenAI have finally hit product-market fit — not on chatbots, but on coding agents. We unpack the receipts (Simon’s $2,180/month personal bill, $10.9B Anthropic Q2, $1.25B/month SpaceX contract, 25% of Uber commits from Claude Code), and walk through the 5 stack changes builders should consider in response.

Simon Willison — ex-Eventbrite Engineering Director, Datasette creator, most-followed AI builder blog on the planet right now — published a post on May 27 titled “I think Anthropic and OpenAI have found product-market fit.” It hit 692↑ on HN in 10 hours.

The thesis is simple and quotable:

“I think they’ve finally found product-market fit, with the coding/general-purpose agent products embodied by Claude Code/Cowork and Codex.”

If you’ve been on the fence about restructuring your AI stack around coding agents specifically — instead of generic chat APIs — Simon’s post is the evidence you’ve been waiting for. This article walks through the data, why it matters, and the 5 concrete stack changes builders should consider this week.


1. The receipts (these are real numbers, not vibes)

Simon’s own monthly bill, made public:

  • Claude Code: $1,199.79 in 30 days
  • OpenAI Codex: $980.37 in 30 days
  • Combined: $2,180.16/month

This is one developer’s personal usage on coding agents specifically, not API exploration, not chatbot subscriptions. It’s the “I’d pay this every month forever” willingness-to-pay that Simon’s own PMF definition demands:

“Your customer should suck air through their teeth and then say yes.”

The big numbers behind it:

Data pointSource / context
$10.9BAnthropic’s projected Q2 2026 revenue (the “revenue inflection”)
$1.25B/month for 3 yearsSpaceX × Anthropic contract (May 2026 – May 2029)
25% of Uber Q1 commitsCame from Claude Code (per public reporting)
$1.2B / $4B of 2025 Anthropic revenueCame from Cursor + GitHub Copilot (their pass-through customers)
5.6% of ChatGPT’s 900M WAU pay~50M paid users (the consumer-tier benchmark)
32.6% of OpenAI’s 703 open rolesAre enterprise-sales positions
26.9% of Anthropic’s 390 open rolesAre enterprise-sales positions

The convergent signal: both labs are pivoting hard toward “coding agents sold to enterprise” as the actual revenue engine. ChatGPT consumer subs are footnote-scale next to enterprise Codex/Claude Code seats.

Inflection point: Simon explicitly calls out April 2026 as the moment. Both labs released new model tiers, both raised API prices. The reason they could raise prices: the enterprise buyer “sucked air through their teeth and said yes.”


2. Why this matters specifically to builders

If Simon’s read is right (and the receipts are hard to argue with), three structural assumptions about your AI stack are now broken:

Assumption 1: “We’ll just use the cheapest API”

If Anthropic and OpenAI have PMF on enterprise coding agents, their API pricing power increased in April 2026 and will keep increasing. The “wait it out, prices always drop” thesis is dead for these two providers. Plan for stable or rising prices through 2027.

Assumption 2: “All AI labs converge on the same products”

They didn’t. OpenAI’s PMF lever is Codex (API-priced from April 2026), Anthropic’s PMF lever is Claude Code / Cowork (subscription-tiered after the Nov 2025 redesign). Google, Meta, xAI, DeepSeek haven’t reached comparable PMF in this category — they’re each making different bets. Builder implication: your model choice now embeds a bet on which lab’s coding-agent strategy you trust most for the next 18 months.

Assumption 3: “Microsoft running Copilot means Microsoft is the customer”

Microsoft just canceled internal Claude Code licenses — but Uber pays $X for 25% of their commits to come from Claude Code. Different enterprise buyers will make wildly different cost/quality tradeoffs. Don’t extrapolate from any single big-name customer; the market is fragmenting.


3. The 5 stack changes builders should consider this week

If you’ve been deferring decisions on AI tooling, Simon’s post is the prompt to act. Here’s the practical translation:

Change 1 — Stop trying to abstract over vendors

If you’re building on a generic “LLM abstraction layer” so you can swap providers later, you’re paying complexity cost for an option you probably won’t exercise. Anthropic and OpenAI are the PMF leaders for coding work; pick one as your primary, build features that exploit its specific surface (Claude’s claude.md + skills + subagents, or OpenAI’s responses API + structured outputs), and move on. The optionality wasn’t worth what it cost you.

Change 2 — Architect for cost-tier handoff, not single-tier

Simon spends $2,180/month and it’s worth it for him personally. Your product can’t pass that through to most users. The right pattern, as we covered in the Cactus Hybrid Router writeup, is on-device pre-handle the easy cases, route the hard cases to a frontier model. Builders shipping consumer or pro-sumer products should treat $2,180/month/user as the worst-case cost ceiling, not the median.

Change 3 — Don’t sleep on the open-weight discount tier

DeepSeek V4 Pro at the permanent 75% discount and Reasonix’s 99.82% cache-hit benchmark put coding-agent costs 5-10× below Anthropic / OpenAI tier. Where the workload doesn’t need frontier reasoning, the open-weight stack is shippable now. The Anthropic/OpenAI PMF Simon describes is despite DeepSeek’s price competition — meaning the enterprise buyer is paying for quality and integration, not for the only-available-option.

Change 4 — Take “agent quality” seriously, the way EURECOM measured it

The PMF Simon describes is for coding agents — a category where reliability decay is a real phenomenon, losing 30 percentage points from “unconstrained” to “production-grade.” Enterprise buyers will eventually demand SLAs around this. If you’re building agent products, instrument now for: assertion pass rates, framework-specific success rates, data-layer error categories. The buyers will catch up to the metric in 6-12 months.

Change 5 — Run Nolan’s multi-agent review loop on any code you ship

Simon spending $2,180 on Claude Code + Codex isn’t because the models work perfectly — it’s because the human-in-the-loop review makes them work. Builders shipping AI-assisted code at scale need a multi-agent review pattern, not just a single-shot generator. We walked through Nolan Lawson’s actual workflow in detail.


4. The risks Simon doesn’t quite name

For honesty, three risks I’d add to Simon’s read:

  1. Two-vendor dependency. If your entire backend depends on Anthropic + OpenAI, you have two single points of failure. The PMF makes them less likely to disappear, but more likely to raise prices unilaterally.
  2. PMF moments can reverse. Cursor’s pre-IPO valuation in late 2025 was partially based on Anthropic API pass-through margin. When Anthropic raised prices in April 2026, that margin compressed — and Cursor adapted (badly, per builder reports). Your product can hit similar margin shocks; budget for them.
  3. The 5.6% ChatGPT paid rate is the real ceiling for consumer AI. Even at PMF, ~94% of users won’t pay. If you’re building consumer AI, your monetization model has to assume the same conversion rate — and the same cost ceiling.

5. What this slots into

You’re now reading the 6th article in a tight SEO cluster on the 2026 builder coding-agent stack:

If you want the comparison-shopping take on the public-API tier that competes for the budget Simon describes, see our Gemini 3.5 Flash vs Claude Haiku 4.5 deep dive.


The bigger picture

Simon’s post is the cleanest public articulation I’ve seen of a thing that was obvious from the inside of any well-tooled builder team for the past 3-4 months: coding agents work, builders pay, and the labs that built the best coding agents are about to print money for the next 18-24 months.

The builder response to this isn’t “don’t use Claude Code or Codex” — Simon obviously thinks the spend is worth it for the work he gets out. The builder response is structure your stack so the $2,180/month is the worst-case cost, not the every-case cost: handoff tiers, multi-agent review for quality, open-weight for the volume cases, frontier-tier for the cases that need it.

For PMF moments in adjacent sectors (image generation, voice, on-device), we’ll cover those individually as the receipts come in.


Sources