OpenAI Ships GPT-Realtime-2 — A Voice Model That Reasons Inside the Audio Loop

OpenAI launched GPT‑Realtime‑2 and two companion voice models on May 7, 2026. The flagship brings GPT‑5‑class reasoning to live voice with a 128K context window.

The Launch

On May 7, 2026, OpenAI dropped three new voice models: GPT‑Realtime‑2, GPT‑Realtime‑Translate, and GPT‑Realtime‑Whisper. The flagship brings GPT‑5‑class reasoning running inside the audio loop — not bolted on between transcription and synthesis. TechCrunch reported OpenAI's goal: "move real‑time audio from simple call‑and‑response toward voice interfaces that can actually do work."

What's New in GPT‑Realtime‑2

The context window jumps from 32K to 128K tokens, and reasoning effort is configurable across five levels. OpenAI's developer docs confirm that features teams previously built through middleware are now first‑class:

  • Preambles: Say "Let me check that" aloud while a tool call runs
  • Parallel tool calls: Fire multiple backend requests simultaneously
  • Recovery behavior: Handle tool failures gracefully without freezing
  • Tone control: Adjust speaking tone deliberately to fit the context
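
The feature list above would plausibly surface as session configuration. Here is a minimal sketch: every field name (`reasoning_effort`, `parallel_tool_calls`, `tool_call_preamble`) is an assumption modeled on the conventions of OpenAI's existing Realtime API, not a confirmed parameter, and the two tools are placeholders.

```python
# Hypothetical session.update payload for GPT-Realtime-2.
# ASSUMPTION: all field names below are guesses modeled on OpenAI's
# existing Realtime API; consult the official docs for the real schema.
session_config = {
    "type": "session.update",
    "session": {
        "model": "gpt-realtime-2",
        # Assumed: one of the five configurable reasoning-effort levels.
        "reasoning_effort": "medium",
        # Assumed: allow multiple backend tool calls in flight at once.
        "parallel_tool_calls": True,
        # Assumed: speak a short preamble ("Let me check that")
        # while a tool call is running.
        "tool_call_preamble": True,
        # Tone control expressed through plain instructions.
        "instructions": "Keep a calm, professional tone on support calls.",
        "tools": [
            {"type": "function", "name": "lookup_order",
             "description": "Fetch order status by ID."},
            {"type": "function", "name": "check_inventory",
             "description": "Check stock for a SKU."},
        ],
    },
}
```

The two placeholder tools show what parallel tool calls would fan out over: a single user turn could trigger both lookups while the model narrates a preamble.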

The Next Web noted reasoning happens inside the audio loop — an architectural difference from stitched‑together stacks.

Companion Models: Translate and Whisper

GPT‑Realtime‑Translate handles 70+ input languages and translates into 13 output languages in real time. GPT‑Realtime‑Whisper provides streaming speech‑to‑text. OpenAI's launch blog positions the pair for live captions, meeting notes, and continuous voice‑agent understanding. Both are billed by the minute.

Benchmarks and Early Results

GPT‑Realtime‑2 scored 15.2% higher than Realtime‑1.5 on Big Bench Audio and 13.8% higher on Audio MultiChallenge. The Next Web reported customer results: Zillow saw a 26‑point lift in call‑success rate (69% to 95%). BolnaAI reported 12.5% lower word error rates on Hindi, Tamil, and Telugu.

Pricing: How It Stacks Up

GPT‑Realtime‑2: $32/1M audio input tokens, $64/1M output, $0.40/1M cached (~$0.048/min raw). Translate at $0.034/min and Whisper at $0.017/min are priced aggressively. Deepgram's Voice Agent API runs $4.50/hr ($0.075/min). ElevenLabs ElevenAgents charges $0.080/min (burst: $0.160/min). A typical multi‑vendor voice stack runs $0.10-$0.25/min. OpenAI's single‑model approach aims below that floor.
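
The quoted per‑minute figures can be reconciled with the token prices under one assumption about audio token density. The sketch below assumes roughly 1,500 audio input tokens per minute of speech, a figure chosen so the token math lands on the article's ~$0.048/min; OpenAI's actual tokens‑per‑minute rate is not stated here.

```python
# Reproduce the quoted per-minute economics from the token prices.
# ASSUMPTION: ~1,500 audio input tokens per minute of speech, a figure
# picked to match the article's ~$0.048/min; not an official rate.
TOKENS_PER_MIN = 1_500
INPUT_PRICE_PER_M = 32.00  # $ per 1M audio input tokens

realtime2_per_min = INPUT_PRICE_PER_M * TOKENS_PER_MIN / 1_000_000

competitors = {
    "Deepgram Voice Agent": 4.50 / 60,     # $4.50/hr
    "ElevenLabs ElevenAgents": 0.080,      # non-burst rate
    "Multi-vendor stack (low end)": 0.10,
}

print(f"GPT-Realtime-2 input: ${realtime2_per_min:.3f}/min")
for name, per_min in competitors.items():
    print(f"{name}: ${per_min:.3f}/min ({per_min / realtime2_per_min:.1f}x)")
```

Under that assumption the raw input cost comes out to $0.048/min, with Deepgram at roughly 1.6x and ElevenLabs at roughly 1.7x; output and cached tokens would raise the all‑in figure.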

What This Means for Voice AI Stacks

Until now, production voice agents ran on a multi‑vendor stack: Whisper/Deepgram for transcription, ElevenLabs/Cartesia for TTS, GPT‑4/Claude for reasoning, plus custom orchestration. The Next Web described GPT‑Realtime‑2 as a direct replacement. Teams optimizing for latency, simplicity, and cost at scale now have a single‑vendor option. TechCrunch noted built‑in safety guardrails halt conversations that violate content guidelines.
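
The architectural shift described above can be sketched as function composition. Every function here is a placeholder stub, not a real SDK call; the point is the shape of the two pipelines, not the implementations.

```python
# Illustrative shapes only -- all functions are placeholder stubs.

def stt_transcribe(audio: bytes) -> str:   # e.g. Whisper/Deepgram
    return audio.decode()

def llm_respond(text: str) -> str:         # e.g. GPT-4/Claude
    return f"reply to: {text}"

def tts_synthesize(text: str) -> bytes:    # e.g. ElevenLabs/Cartesia
    return text.encode()

def legacy_stack(audio: bytes) -> bytes:
    """Three vendor hops, plus custom orchestration between each."""
    return tts_synthesize(llm_respond(stt_transcribe(audio)))

def realtime2_respond(audio: bytes) -> bytes:
    """Speech-to-speech: one hop, reasoning inside the audio loop."""
    return b"reply to: " + audio
```

The single‑model path removes two vendor boundaries and the glue code between them, which is where the latency, simplicity, and cost arguments come from.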
