AI Pricing War

AI Model Market Splits as OpenAI Doubles Prices and DeepSeek Undercuts

In 24 hours, OpenAI doubled GPT‑5.5 pricing while DeepSeek launched V4 at one‑ninth the cost. The comfortable middle tier of AI models is vanishing, forcing developers to choose between premium integrated stacks and cheap open‑weight alternatives. Here's what the split means for builders.

The 24‑Hour Price Split That Broke the AI Middle

Within 24 hours in late April 2026, the AI model market split into two distinct economies. On April 23, OpenAI launched GPT‑5.5 at $5/million input tokens and $30/million output tokens — exactly double the GPT‑5.4 rate. The next day, DeepSeek released V4‑Pro at $1.74/million input and $3.48/million output, with a V4‑Flash tier at $0.14/$0.28.

V4‑Pro output tokens cost roughly one‑ninth as much as GPT‑5.5 output at list price. The price gap between premium and open‑weight models is now wider than it has been in years, and the comfortable middle tier that most coding agents routed through is thinning out.
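
To see what that ratio means for a real bill, here's a back‑of‑the‑envelope comparison at the list prices above. The monthly token volumes are hypothetical, chosen only to illustrate the spread:

```python
# Back-of-the-envelope cost comparison at the list prices above (USD per 1M tokens).
# The monthly token volumes are hypothetical, for illustration only.

PRICES = {
    "gpt-5.5":  {"input": 5.00, "output": 30.00},
    "v4-pro":   {"input": 1.74, "output": 3.48},
    "v4-flash": {"input": 0.14, "output": 0.28},
}

def monthly_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """USD cost for a month of usage, given millions of tokens in and out."""
    p = PRICES[model]
    return input_mtok * p["input"] + output_mtok * p["output"]

# Example workload: 200M input tokens and 50M output tokens per month.
for model in PRICES:
    print(f"{model:10s} ${monthly_cost(model, 200, 50):>9,.2f}/month")
# gpt-5.5    $ 2,500.00/month
# v4-pro     $   522.00/month
# v4-flash   $    42.00/month

print(f"output-token price ratio: {30.00 / 3.48:.1f}x")  # 8.6x, i.e. roughly one-ninth
```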

The New Price Table

Here's how the frontier market looks after both launches:

Model | Input (per 1M tokens) | Output (per 1M tokens) | Context
OpenAI GPT‑5.5 | $5.00 | $30.00 | 1M tokens
Anthropic Opus 4.7 | $5.00 | $25.00 | 1M tokens
DeepSeek V4‑Pro | $1.74 | $3.48 | 1M tokens
DeepSeek V4‑Flash | $0.14 | $0.28 | 1M tokens

Anthropic's Opus 4.7 sits in the premium cluster alongside OpenAI, while DeepSeek occupies the budget cluster with no meaningful competition at its price point. There is no model priced in the $5–15/million output range that offers competitive performance — that middle ground has simply vanished.

What OpenAI Is Actually Selling

OpenAI's pricing isn't just about tokens — it's about selling an integrated stack. GPT‑5.5 is the centerpiece of a system that includes Codex (with expanded computer use, browser interaction, and longer agentic runs), ChatGPT across Plus/Pro/Business/Enterprise tiers, and the API. According to The New Stack, "OpenAI is not selling tokens. It is selling outcomes, and outcomes are now priced accordingly."

The cadence matters too: GPT‑5.4 shipped in early March, and GPT‑5.5 followed seven weeks later. That's enterprise procurement strategy, not a benchmark race. Cheaper tiers (GPT‑5.4, GPT‑5.4 mini/nano, Batch, Flex, Priority, cached input) remain on the price list, but the flagship is what coding agents default to.

GPT‑5.5 scores 82.7% on Terminal‑Bench 2.0, up from 75.1% on GPT‑5.4. OpenAI claims token efficiency offsets the price hike, though no precise effective‑cost figure has been published.
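
A quick sanity check on that claim: assuming cost scales linearly with the tokens a task consumes, a doubled price needs a 50% cut in tokens per task just to break even. A minimal sketch, with hypothetical efficiency figures since OpenAI hasn't published one:

```python
# Sanity check on "token efficiency offsets the price hike".
# Assumes cost scales linearly with output tokens; the efficiency gain is
# hypothetical, since OpenAI has not published an effective-cost figure.

old_price = 15.00  # GPT-5.4 output $/1M tokens, inferred from "exactly double"
new_price = 30.00  # GPT-5.5 output $/1M tokens

# To hold effective cost per task flat, GPT-5.5 must finish the same task
# with proportionally fewer output tokens.
break_even_reduction = 1 - old_price / new_price
print(f"break-even token reduction: {break_even_reduction:.0%}")  # 50%

# If the new model needs, say, only 30% fewer tokens per task (hypothetical),
# the effective cost per task still goes up.
tokens_old, tokens_new = 1.0, 0.70  # relative output tokens per task
change = (tokens_new * new_price) / (tokens_old * old_price) - 1
print(f"effective cost change at 30% fewer tokens: {change:+.0%}")  # +40%
```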

What DeepSeek Is Actually Shipping

DeepSeek made three strategic decisions that distinguish V4 from the premium cluster:

  • Architecture: V4‑Pro uses 1.6T total parameters with 49B active per token (Mixture of Experts). V4‑Flash runs 284B total / 13B active. A hybrid attention scheme that combines sparse attention with heavy KV‑cache compression reduces the FLOPs and KV‑cache footprint of 1M‑token inference (a rough memory sketch follows this list).
  • Distribution: MIT license — one of the most permissive open‑source licenses available. Anyone can download, host, fine‑tune, and embed it commercially. As The New Stack notes, DeepSeek is betting that frontier intelligence becomes infrastructure the way Linux did.
  • Hardware independence: On launch day, Huawei announced that Ascend supernodes offer full support for V4 inference. Reuters reported V4 was adapted for Huawei's most advanced Ascend AI chips, with Huawei chips used for part of V4‑Flash's training. SMIC stock jumped 10% in Hong Kong; Hua Hong Semiconductor rose 15%.
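
To put those parameter counts in perspective, here's a rough weight‑memory estimate at 8‑bit precision (1 byte per parameter). It ignores KV cache, activations, and serving overhead, so treat the numbers as lower bounds rather than deployment specs:

```python
# Rough weight-memory estimate for the quoted parameter counts at 8-bit
# precision (1 byte per parameter). Ignores KV cache, activations, and
# serving overhead, so these are lower bounds for inference.

MODELS = {
    "V4-Pro":   {"total_b": 1600, "active_b": 49},  # billions of parameters
    "V4-Flash": {"total_b": 284,  "active_b": 13},
}

for name, p in MODELS.items():
    # Every expert must stay resident in memory, even though only the
    # "active" subset participates in any single token's forward pass.
    weights_gb = p["total_b"]  # 1B params at 1 byte each ~= 1 GB
    print(f"{name}: ~{weights_gb} GB of weights resident, "
          f"~{p['active_b']}B parameters active per token")
# V4-Pro: ~1600 GB of weights resident, ~49B parameters active per token
# V4-Flash: ~284 GB of weights resident, ~13B parameters active per token
```

That roughly 284 GB weight footprint is what makes the self‑hosting math below plausible for mid‑size teams, while V4‑Pro's ~1.6 TB of weights stays out of reach for most on‑prem setups.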

Performance is competitive: V4‑Pro hits 80.6% on SWE‑bench Verified, within striking distance of Claude Opus 4.6. But there's a critical caveat: V4 is text‑only at launch — not a drop‑in replacement for multimodal workloads.

Three Concrete Shifts for the Builder Layer

This price gap doesn't just affect API bills — it reshapes how developers build with AI. The New Stack identifies three concrete shifts:

  • Agent platforms become model‑agnostic by necessity. A coding agent that uses GPT‑5.5 for planning and V4‑Flash for bulk‑editing is no longer exotic — it's the obvious architecture at this price gap (a routing sketch follows this list). DeepSeek notes V4 is optimized for agent tools including Claude Code and OpenClaw.
  • Self‑hosting math changes for the first time in two years. V4‑Flash at 284B total / 13B active runs on multi‑GPU setups that mid‑size teams can afford. The trade‑off: give up managed reliability for predictable inference costs and full model control.
  • The Nvidia-only assumption is weakening. A frontier‑tier model shipping optimized for non‑Nvidia silicon expands the long‑run set of viable inference targets and tightens Nvidia's timeline on the China question.
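
Here's what that dual‑routing architecture can look like in practice. This is a minimal sketch: the model identifiers echo the pricing table above, and the task categories and LLMClient interface are hypothetical stand‑ins, not any vendor's actual SDK:

```python
# Minimal sketch of tier-based model routing inside a coding agent.
# The model identifiers echo the pricing table above; the task categories
# and the LLMClient interface are hypothetical stand-ins, not a real SDK.

from dataclasses import dataclass
from typing import Protocol

class LLMClient(Protocol):
    def complete(self, model: str, prompt: str) -> str: ...

@dataclass
class TieredRouter:
    client: LLMClient
    premium_model: str = "gpt-5.5"            # planning, design, hard debugging
    budget_model: str = "deepseek-v4-flash"   # bulk edits, refactors, test scaffolding

    def route(self, task_kind: str) -> str:
        # High-leverage reasoning goes to the premium tier; high-volume
        # mechanical work goes to the budget tier.
        if task_kind in {"plan", "design", "debug"}:
            return self.premium_model
        return self.budget_model

    def run(self, task_kind: str, prompt: str) -> str:
        return self.client.complete(model=self.route(task_kind), prompt=prompt)

# Usage: router.run("plan", ...) hits the premium model;
# router.run("bulk_edit", ...) hits the budget one.
```

The point of the indirection: swapping a tier means changing a configuration string, not rewriting the agent's core logic.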

The Missing Middle: Where Do Builders Go?

The disappearance of the middle tier creates a real dilemma for developers who don't fit neatly into either cluster:

  • Startups building vertical products need frontier intelligence but can't justify $30/million output tokens. For them, the DeepSeek V4‑Pro tier at $3.48/million output is attractive — but text‑only support limits use cases.
  • Enterprise teams with compliance requirements can't easily adopt open‑weight models from a Chinese lab, regardless of price. They're locked into the premium cluster.
  • Coding agent builders face the most interesting choice: route different parts of the workflow to different tiers, or stick with a single provider for simplicity? The dual‑routing approach is gaining traction fast.

There's also a durability question. The New Stack argues the gap won't close on its own in the near term: OpenAI will continue releasing fast and pricing up (integrated product as moat), while DeepSeek will continue releasing open weights and pricing down (commodity infrastructure thesis). The models in between — Claude Sonnet, Gemini Pro, Mistral Large — will face increasing pressure to pick a side.

What to Watch Next

Several developments will determine whether this split becomes permanent:

  • DeepSeek V4 multimodal launch: When V4 adds image, video, and audio support, it becomes a credible replacement for premium models in most coding and research workflows. That's the inflection point.
  • Anthropic's pricing response: Opus 4.7 already sits in the premium cluster at $25/million output. If Anthropic introduces a cheaper tier (Sonnet‑class with V4‑Pro pricing), the middle fills back in.
  • Hybrid routing adoption: If tools like Cursor, Claude Code, and OpenClaw ship built‑in model routing between premium and budget tiers, the two‑cluster market becomes the default architecture rather than a temporary disruption.
  • Enterprise self‑hosting: V4‑Flash's hardware requirements make on‑prem deployment viable. If regulated industries start self‑hosting, it shifts the economics of the entire market.

For now, the smartest move for builders is to design for model flexibility. Build your agent infrastructure so you can route between tiers without rewriting your core logic. The AI middle class may be disappearing, but the builders who adapt fastest will be the ones who benefit most from the split.
