Developer Tools
OpenAI Ships GPT-Realtime-2 — A Voice Model That Reasons Inside the Audio Loop
OpenAI launched GPT‑Realtime‑2 and two companion voice models on May 7, 2026. The flagship brings GPT‑5‑class reasoning to live voice with 128K context window.
The Launch
On May 7, 2026, OpenAI dropped three new voice models: GPT‑Realtime‑2, GPT‑Realtime‑Translate, and GPT‑Realtime‑Whisper. The flagship brings GPT‑5‑class reasoning running inside the audio loop — not bolted on between transcription and synthesis. TechCrunch reported OpenAI's goal: "move real‑time audio from simple call‑and‑response toward voice interfaces that can actually do work."
What's New in GPT‑Realtime‑2
The context window jumps from 32K to 128K tokens. Reasoning effort is configurable across five levels. OpenAI's developer docs confirm production features that teams previously built through middleware are now first‑class:
- Preambles: "Let me check that" while calling tools
- Parallel tool calls: Fire multiple backend requests simultaneously
- Recovery behavior: Graceful failure handling without freezing
- Tone control: Deliberate tone adjustment for context
The Next Web noted reasoning happens inside the audio loop — an architectural difference from stitched‑together stacks.
Companion Models: Translate and Whisper
GPT‑Realtime‑Translate handles 70+ input languages and outputs into 13 languages in real time. GPT‑Realtime‑Whisper provides streaming speech‑to‑text. OpenAI's launch blog positions these for live captions, meeting notes, and continuous voice‑agent understanding. Both billed by the minute.
Benchmarks and Early Results
GPT‑Realtime‑2 scored 15.2% higher than Realtime‑1.5 on Big Bench Audio and 13.8% higher on Audio MultiChallenge. The Next Web reported customer results: Zillow saw a 26‑point lift in call‑success rate (69% to 95%). BolnaAI reported 12.5% lower word error rates on Hindi, Tamil, and Telugu.
Pricing: How It Stacks Up
GPT‑Realtime‑2: $32/1M audio input tokens, $64/1M output, $0.40/1M cached (~$0.048/min raw). Translate at $0.034/min and Whisper at $0.017/min are priced aggressively. Deepgram's Voice Agent API runs $4.50/hr ($0.075/min). ElevenLabs ElevenAgents charges $0.080/min (burst: $0.160/min). A typical multi‑vendor voice stack runs $0.10-$0.25/min. OpenAI's single‑model approach aims below that floor.
What This Means for Voice AI Stacks
Until now, production voice agents ran on a multi‑vendor stack: Whisper/Deepgram for transcription, ElevenLabs/Cartesia for TTS, GPT‑4/Claude for reasoning, plus custom orchestration. 2 described GPT‑Realtime‑2 as a direct replacement. Teams optimizing for latency, simplicity, and cost at scale now have a single‑vendor option. TechCrunch noted built‑in safety guardrails halt conversations that violate content guidelines.
Sources
- 1.TechCrunch(techcrunch.com)
- 2.The Next Web(thenextweb.com)
Jun 12, 2026
Perplexity Moves Deep Research Into Computer, Routing Tasks Across 20+ AI Models
Perplexity has moved its Deep Research capability into Computer, its multi-model orchestration system that breaks complex questions into subtasks and routes them across 20+ frontier AI models. The upgrade produces work-ready reports, decks, and dashboards.
Jun 12, 2026
GPT-5.5 Beats Claude Fable 5 on Brutal New Agents' Last Exam Benchmark
OpenAI's GPT-5.5 beat Anthropic's brand-new Claude Fable 5 on the Agents' Last Exam benchmark, a grueling new test from UC Berkeley that measures whether AI can execute real, economically valuable professional workflows — and both models still fail most of the time.
Related News
Jun 12, 2026
OpenAI Buys Cloud Startup Ona So Codex Can Run Tasks While Your Laptop Is Closed
OpenAI is acquiring German cloud startup Ona to give Codex persistent cloud environments where AI agents can run multi-step coding tasks across hours or days — even when your laptop is shut. Codex now has over 5 million weekly active users.
Jun 9, 2026
OpenAI Files for IPO a Week After Anthropic as AI Giants Race to Wall Street
OpenAI confidentially filed for an IPO just one week after rival Anthropic.
Jun 8, 2026
The Tokenpocalypse Is Here: Copilot Bills Jump 25x as AI Pricing Reckoning Begins
GitHub Copilot's switch to token-based billing triggered bills jumping from $29 to $750 overnight for some developers. But the'Tokenpocalypse' is bigger than one product — it signals the end of VC-subsidized AI and a pricing reckoning that will reshape how every developer builds.