Updated 6 hours ago
Anthropic to Widely Release Mythos-Level AI Models Within Weeks, 7 Weeks After Deeming Them Too Dangerous

Anthropic Mythos

Anthropic to Widely Release Mythos-Level AI Models Within Weeks, 7 Weeks After Deeming Them Too Dangerous

Anthropic announced Thursday it plans to widely release Mythos‑level AI models — capable of autonomously finding and exploiting zero‑day vulnerabilities across every major operating system and browser — just seven weeks after deeming the technology too dangerous for public access. The company says it has made swift progress on safety safeguards, but developers and cybersecurity experts remain deeply unsettled.

Glowing keyhole with a hand reaching toward it, symbolizing the controlled release of powerful Mythos-level AI models
AI-generated editorial illustration (Gemini 3 Pro)

The 7‑Week Pivot

On April 7, 2026, Anthropic dropped a bombshell: its new Claude Mythos Preview model was too dangerous to release to the public. The UK's AI Security Institute (AISI) had just completed an evaluation finding the model could autonomously discover and exploit zero-day vulnerabilities across every major operating system and web browser — capabilities the company said required extraordinary restraint. Mythos would be restricted to a handpicked group of tech giants through "Project Glasswing," the BBC reported.

Seven weeks later, that posture has evaporated. On May 28, Anthropic announced it had made "swift progress" on safety safeguards and expects to "bring Mythos‑class models to all our customers in the coming weeks," TechCrunch first reported. The reversal came embedded in the launch announcement for Claude Opus 4.8, a new flagship model positioned as a bridge to the Mythos era — and, critically, as a demonstration that less capable models can be made honest enough to be safe.

The timing is inescapable. Earlier that same day, The Guardian revealed that Anthropic had just closed a $65 billion funding round at a $965 billion post‑money valuation, eclipsing OpenAI to become the world's most valuable AI firm. Few in the industry believe the proximity of that valuation to the Mythos safety reversal is coincidental.

What Mythos Actually Does

The capabilities that spooked regulators in April are not hypothetical. In controlled evaluations by the UK's AISI, Mythos Preview achieved a 73% success rate on expert‑level capture‑the‑flag challenges — tasks that no model could complete at all before April 2025. More alarmingly, it became the first model ever to complete "The Last Ones," a 32‑step corporate network attack simulation that human professionals estimate would take roughly 20 hours. Mythos solved it end‑to‑end in 3 out of 10 attempts, according to the AISI evaluation.

The model doesn't just find bugs — it autonomously writes sophisticated exploits. Its output includes ROP chains, JIT heap sprays, KASLR bypasses, and multi‑stage sandbox escapes. In one notable case, it discovered a vulnerability that had lain dormant in a critical piece of security infrastructure for 27 years. "Mythos Preview has already found thousands of high‑severity vulnerabilities, including some in every major operating system and web browser," Anthropic stated at launch, as covered by the BBC.

Former UK National Cyber Security Centre head Ciaran Martin described the situation in stark terms: the model represents an "evolutionary step" in autonomous cyber offense, and current testing environments feature "near‑nonexistent defenses." Martin told the BBC that whether Mythos itself or subsequent models from Anthropic or its rivals, the underlying vulnerabilities of the internet now face a qualitatively new threat. The ability of non‑experts to generate working exploits overnight using Mythos‑level models fundamentally changes the attack surface, The Guardian noted in its coverage.

The Safety Sprint

Anthropic's seven‑week safety sprint centered on making models more honest about their limitations — a property the company now characterizes as the critical safeguard. Opus 4.8, released alongside the Mythos announcement, was explicitly designed to address this. The model scored 69.2% on SWE‑Bench Pro (up from Opus 4.7's 64.3%) and hit 1890 on GDPval, a benchmark measuring economically viable work completion. But the headline metric was behavioral: early testers reported the model was "more likely to flag uncertainties about its work and less likely to make unsupported claims," Inc. Magazine reported.

Bridgewater Associates, among the early testers, said Opus 4.8's defining upgrade was its "tendency to proactively flag issues with the inputs and outputs of an analysis, something other models routinely missed and left to the users to catch." Niko Grupen, head of applied research at legal AI company Harvey, said the model reached "the highest score ever recorded" on the firm's internal legal agent benchmark, bringing "the kind of accuracy lift that translates directly into how much real attorney work our customers can hand off with confidence," according to Inc.

But "honesty" as a safety mechanism for models that autonomously generate zero-day exploits raises uncomfortable questions. The AISI itself noted that its testing environments lacked active defenders and real‑time monitoring — meaning Mythos performed its attacks against networks with no one watching. In the real world, detecting an AI‑driven multi‑stage intrusion that completes in hours what takes humans weeks remains an unsolved problem.

Builders React: 'Creeped Out' and Pushed Aside

Among the developers who would be expected to wield these tools, the mood is not celebratory. At recent Claude Code workshops in London, builders told Bloomberg's Parmy Olson they were becoming "increasingly concerned" about AI models being given high levels of autonomy. Developers described being "pushed out of the programming process," reduced to watching opaque AI agents generate code over hours or even days with no visibility into the chain of thought driving the output, Futurism reported.

Claude Code head of product Cat Wu acknowledged the communication gap, telling Bloomberg the system was "incredibly secure" and characterizing the unease as a problem of "insufficient communication, not a lack of controls." That framing has done little to reassure. The latest iterations of Claude Code no longer display text describing their ongoing chain of thought, further obfuscating the decision‑making of agents that can now orchestrate "codebase‑scale migrations across hundreds of thousands of lines of code from kickoff to merge," as TechCrunch detailed.

A parallel concern is skill atrophy. As 404 Media reported earlier this month, software developers across the industry are observing their peers rapidly losing technical skills due to over‑reliance on AI coding assistants. If Mythos‑class models make exploit‑generation accessible to non‑experts while simultaneously deskilling the defender workforce, the asymmetry in cybersecurity could widen dramatically.

The OpenAI Comparison

Perhaps the most telling context for Anthropic's reversal is what happened — or rather, didn't happen — when OpenAI crossed the same threshold first. GPT‑5.3‑Codex, released in February 2026, was classified by OpenAI as "High capability for cybersecurity‑related tasks" under its Preparedness Framework. It crossed the same "high cybersecurity capability" threshold that Anthropic would later deem too dangerous for public release — roughly seven weeks earlier. OpenAI's response was not a restricted preview or a safety media campaign. It was Trusted Access for Cyber, a graduated access system, documented in its system card.

The contrast exposes the strategic dimension of Anthropic's safety rhetoric. Anthropic co‑founder Chris Olah traveled to the Vatican this month to speak at the release of Pope Leo's first encyclical — which was, notably, about AI and its risks. Olah told the Pope his team kept discovering "unsettling" things inside their models, Futurism reported. The company has poured millions into lobbying and Super PACs aimed at candidates who support AI regulation that would, conveniently, constrain competitors racing to catch up.

When OpenAI crossed the cybersecurity capability line without fanfare, it demonstrated that the threat could arrive with zero warning. Anthropic's more theatrical approach — declare the models dangerous, restrict them, then reverse course after a $65 billion raise — looks less like a safety protocol and more like a valuation strategy.

What This Means for Cyber Defense

The imminent wide release of Mythos‑level models forces a reckoning across three fronts: patching pipelines, cyber defense economics, and the regulatory vacuum.

On patching, the math has shifted permanently. Vulnerabilities that Mythos finds autonomously are the same ones that enterprise security teams have failed to patch for years — sometimes decades. The 27‑year‑old OpenBSD bug and the Linux kernel vulnerabilities discovered during the preview period are not anomalies; they are representative of the backlog that AI‑driven attackers will now exploit at machine speed. "AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure," Cisco's Anthony Grieco said at the Glasswing launch, as reported by The Guardian.

The defense economics may be even more lopsided. Running Mythos‑class models at inference scale is expensive — the AISI used 100 million token budgets in its evaluations, and performance continued to scale with more compute. But attackers only need to succeed once, while defenders must succeed every time. Lee Klarich of Palo Alto Networks warned the model "signals a dangerous shift" and that "there will be more attacks, faster attacks and more sophisticated attacks," as quoted by Anthropic. Meanwhile, the UK's NCSC continues to emphasize that basic cybersecurity hygiene — patching, access controls, logging — remains the most effective defense, even against AI‑assisted attackers.

On regulation, the vacuum is acute. Anthropic is still locked in a legal battle with the Pentagon after refusing to remove safeguards that would allow Claude to be used for mass domestic surveillance or lethal autonomous weapons. The Trump administration has publicly called the company a "radical left, woke company" according to Politico, while simultaneously relying on its technology. This contradictory posture has left exactly zero binding regulation in place as the most cyber‑capable AI models in history prepare for general release.

Share this article

PostShare

More on This Story

Related News