OpenAI Offers $25,000 for Biosafety Jailbreaks in GPT-5.5 Bug Bounty
OpenAI launched a Bio Bug Bounty for GPT‑5.5, offering $25,000 to researchers who find a universal jailbreak that defeats five biosafety questions. But the invite‑only program, strict NDA, and low payout have drawn fierce criticism from the security community.
What the GPT‑5.5 Bio Bug Bounty Actually Does
OpenAI has launched a Bio Bug Bounty for GPT‑5.5, inviting security researchers to find a universal jailbreak that bypasses the model's biosafety safeguards. Announced April 23 on OpenAI's blog, the program challenges participants to craft a single prompt that defeats all five undisclosed biosafety questions without triggering moderation.
The scope is narrow: only GPT‑5.5 running in Codex Desktop is in scope. The five target questions remain secret — applicants don't know what they're trying to bypass until they're accepted into the program. The reward is $25,000 for the first successful universal jailbreak, with smaller discretionary awards for partial results.
How the Program Works
The bug bounty operates on a strict timeline and access model:
- Application period: April 23 – June 22, 2026 (rolling acceptances)
- Testing period: April 28 – July 27, 2026
- Access: Invite‑and‑application — OpenAI extends invitations to a "vetted list of trusted bio red‑teamers" and reviews new applications
- NDA required: All prompts, completions, findings, and communications are covered by non‑disclosure agreement
- Single payout: Only the first universal jailbreak earns the full bounty
According to Heise Online, this builds on OpenAI's earlier Safety Bug Bounty launched in March 2026, which focused on agentic risks and data exposure. The Bio Bug Bounty is specifically scoped to biological content safeguards, a domain that has grown more sensitive since OpenAI signed a Pentagon contract.
The GPT‑5.5 Safety Benchmarks Behind the Bounty
The bounty exists because GPT‑5.5 represents a meaningful capability jump. According to OpenAI's Deployment Safety Hub, the GPT‑5.5 System Card rates the model as "High" capability in the biological and chemical domain — the same rating as GPT‑5.4. The model was subjected to full predeployment safety evaluations, the Preparedness Framework, targeted red‑teaming for cybersecurity and biology, and feedback from roughly 200 early‑access partners.
Key safety metrics from the system card:
- Violent illicit behavior refusal: 97.9% (up from 97.1% on GPT‑5.4‑Thinking)
- Harassment refusal: 82.2% (up from 79.0%)
- Destructive action avoidance: 0.90 (up from 0.86 on GPT‑5.4‑Thinking)
- Prompt injection resistance: 0.963 (slight regression from GPT‑5.4's 0.998)
OpenAI describes GPT‑5.5 as shipping with its "strongest set of safeguards to date." The bug bounty is designed to test whether those safeguards hold up against determined adversaries.
Why the Security Community Is Pushing Back
The Hacker News discussion of the bounty garnered 154 points and 104 comments, and the verdict was near‑unanimous: the program is structured to discourage serious participation. Several specific criticisms emerged from the HN thread:
- Low payout: $25,000 is 1/20th of OpenAI's previous Kaggle red‑teaming competition, which paid out $500,000. OpenAI reportedly makes ~$65M/day, meaning the bounty equals roughly 33 seconds of revenue (see the arithmetic sketch after this list). As one commenter put it: "If it's an existential threat to humanity, and if OpenAI is valued at nearly $1T, why set the bounty at a measly $25k?"
- Restrictive NDA: Participants can't publish findings regardless of payout outcome. If OpenAI rejects a claim, the NDA still silences the submitter. This makes participation meaningless for resume or portfolio building.
- Gatekept access: Only vetted red‑teamers get invitations, though applications are accepted. The best hackers may not be on any vetted list, and gatekeeping incentivizes non‑approved researchers to sell exploits elsewhere.
- Single winner: Only one person gets paid, regardless of how many unique vulnerabilities are found.
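For scale, the arithmetic behind the "33 seconds of revenue" line is easy to verify. A minimal sketch, taking the thread's ~$65M/day figure at face value (it is a community estimate, not an official number):

```python
# Back-of-the-envelope check on the bounty-vs-revenue comparison.
DAILY_REVENUE_USD = 65_000_000  # ~$65M/day, the HN thread's estimate
BOUNTY_USD = 25_000
KAGGLE_POOL_USD = 500_000       # OpenAI's earlier Kaggle red-teaming pool

revenue_per_second = DAILY_REVENUE_USD / 86_400  # 86,400 seconds per day
print(f"Revenue per second: ${revenue_per_second:,.2f}")                       # ~$752.31
print(f"Bounty as seconds of revenue: {BOUNTY_USD / revenue_per_second:.0f}")  # ~33
print(f"Bounty as fraction of Kaggle pool: 1/{KAGGLE_POOL_USD // BOUNTY_USD}") # 1/20
```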
The PR Strategy Theory
Multiple commenters converged on a theory that the bounty is primarily a PR exercise. A highly‑upvoted comment by user chromacity laid out three goals:
1. Underscores to the general public that the models are amazingly powerful and if you're not using them, your competitors will out‑innovate you.
2. Sends the message to regulators that they don't need to do anything because the companies are diligent to prevent harm.
3. Sends the message to regulators that they sure should be regulating open‑source models, because these hippies are not doing rigorous safety testing.
The logic: a low bounty combined with restrictive terms is designed to discourage serious participation, potentially allowing OpenAI to claim "nobody broke it" and use that as evidence of safety. Heise Online noted the parallel with OpenAI's previous claim that GPT‑2 was "too dangerous to release", a pattern of exaggerated risk narratives.
The Over‑Filtering Problem for Builders
For developers building with OpenAI's API, the biosafety filters are a double‑edged sword. HN commenters reported legitimate research being blocked by overzealous safety classifiers:
- Code to analyze SARS‑CoV‑2 sequences for breakpoints (pure research) was flagged as a biosafety concern, per Hacker News reports
- Anonymized ecommerce risk assessment blocked over gender‑related variable concerns
- Gene drive illustrations for high school audiences triggering biosafety flags
- Medical device reverse engineering blocked by Anthropic's CBRN filter
One commenter on Hacker News summarized the tension: once all the safeguards are applied, the model becomes effectively unusable for legitimate bio‑related work. The bug bounty may strengthen defenses against malicious use, but developers worry it could also tighten filters further, making it harder to build products in healthcare, biotech, and related domains.
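For teams hitting these filters in production, a practical first step is to detect refusals explicitly rather than treating them as generic failures, so blocked requests can be logged and routed to review instead of silently erroring. A minimal sketch using the OpenAI Python SDK; the "gpt-5.5" model name and the exact refusal behavior here are assumptions drawn from this article, not confirmed API details:

```python
# Sketch: surface safety refusals on bio-related prompts instead of
# failing silently. "gpt-5.5" is the article's model name, not a
# confirmed API identifier.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_with_refusal_check(prompt: str) -> str | None:
    resp = client.chat.completions.create(
        model="gpt-5.5",
        messages=[{"role": "user", "content": prompt}],
    )
    msg = resp.choices[0].message
    # Recent SDK versions expose a `refusal` field when the model declines;
    # getattr() keeps this working on versions that lack it.
    if getattr(msg, "refusal", None):
        print(f"Refused: {msg.refusal}")  # log for review / appeal
        return None
    return msg.content

answer = ask_with_refusal_check(
    "Summarize published methods for detecting recombination breakpoints "
    "in SARS-CoV-2 genome sequences."
)
```

Separating "refused" from "failed" also gives teams the data to quantify how often legitimate workloads are being blocked.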
What This Means for Developers
The Bio Bug Bounty is the latest signal that AI safety testing is becoming a public, structured process — but one controlled entirely by the companies building the models. For builders, there are three takeaways:
- If you work in bio or healthcare: Expect filters to tighten, not loosen. Budget time for prompt‑engineering workarounds and consider whether self‑hosted models (like DeepSeek V4) might offer more flexibility for legitimate use cases; a fallback sketch follows this list.
- If you're a security researcher: The program pays less than most corporate bug bounties for far more specialized work. Factor in the NDA restrictions before applying.
- If you're evaluating AI providers: OpenAI is effectively asking the community to validate its safety claims — but on terms that limit independent verification. That's worth weighing when choosing between OpenAI and alternatives like Anthropic's own safety programs or open‑weight models with community‑driven testing.
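One way to build in that flexibility is a thin fallback layer: if the primary provider refuses a request, retry it against a self‑hosted or alternative endpoint. A sketch assuming both endpoints speak an OpenAI‑compatible chat API; the model names and local URL are illustrative, not real deployments:

```python
# Sketch: retry refused requests against an OpenAI-compatible fallback
# (e.g., a self-hosted open-weight model). Names and URLs are illustrative.
from openai import OpenAI

PROVIDERS = [
    (OpenAI(), "gpt-5.5"),  # primary: the hosted model from the article
    (OpenAI(base_url="http://localhost:8000/v1", api_key="unused"),
     "deepseek-v4"),        # fallback: hypothetical self-hosted endpoint
]

def complete_with_fallback(prompt: str) -> str:
    for client, model in PROVIDERS:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        msg = resp.choices[0].message
        if msg.content and not getattr(msg, "refusal", None):
            return msg.content
    raise RuntimeError("All providers refused or returned empty output")
```

The tradeoff: whatever the fallback returns, you now own its safety posture yourself.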