OpenAI Offers $25,000 for Biosafety Jailbreaks in GPT-5.5 Bug Bounty
OpenAI launched a Bio Bug Bounty for GPT‑5.5, offering $25,000 to researchers who find a universal jailbreak that defeats five biosafety questions. But the invite‑only program, strict NDA, and low payout have drawn fierce criticism from the security community.
What the GPT‑5.5 Bio Bug Bounty Actually Does
OpenAI has launched a Bio Bug Bounty for GPT‑5.5, inviting security researchers to find a universal jailbreak that bypasses the model's biosafety safeguards. Announced April 23 on OpenAI's blog, the program challenges participants to craft a single prompt that defeats all five undisclosed biosafety questions without triggering moderation.
The scope is narrow: only GPT‑5.5 running in Codex Desktop is in scope. The five target questions remain secret — applicants don't know what they're trying to bypass until they're accepted into the program. The reward is $25,000 for the first successful universal jailbreak, with smaller discretionary awards for partial results.
How the Program Works
The bug bounty operates on a strict timeline and access model:
- Application period: April 23 – June 22, 2026 (rolling acceptances)
- Testing period: April 28 – July 27, 2026
- Access: Invite‑and‑application — OpenAI extends invitations to a "vetted list of trusted bio red‑teamers" and reviews new applications
- NDA required: All prompts, completions, findings, and communications are covered by non‑disclosure agreement
- Single payout: Only the first universal jailbreak earns the full bounty
According to Heise Online, this builds on OpenAI's earlier Safety Bug Bounty launched in March 2026, which focused on agentic risks and data exposure. The Bio Bug Bounty is specifically scoped to biological content safeguards, a domain that has grown more sensitive since OpenAI signed a Pentagon contract.
The GPT‑5.5 Safety Benchmarks Behind the Bounty
The bounty exists because GPT‑5.5 represents a meaningful capability jump. According to OpenAI's Deployment Safety Hub, the GPT‑5.5 System Card rates the model as "High" capability in the biological and chemical domain — the same rating as GPT‑5.4. The model was subjected to full predeployment safety evaluations, the Preparedness Framework, targeted red‑teaming for cybersecurity and biology, and feedback from roughly 200 early‑access partners.
Key safety metrics from the system card:
- Violent illicit behavior refusal: 97.9% (up from 97.1% on GPT‑5.4‑Thinking)
- Harassment refusal: 82.2% (up from 79.0%)
- Destructive action avoidance: 0.90 (up from 0.86 on GPT‑5.4‑Thinking)
- Prompt injection resistance: 0.963 (slight regression from GPT‑5.4's 0.998)
OpenAI describes GPT‑5.5 as shipping with its "strongest set of safeguards to date." The bug bounty is designed to test whether those safeguards hold up against determined adversaries.
Why the Security Community Is Pushing Back
The Hacker News discussion of the bounty garnered 154 points and 104 comments, and the verdict was near‑unanimous: the program is structured to discourage serious participation. Several specific criticisms emerged from the HN thread:
- Low payout: $25,000 is 1/20th of OpenAI's previous Kaggle red‑teaming competition, which paid out $500,000. OpenAI reportedly makes ~$65M/day, meaning the bounty equals roughly 33 seconds of revenue (see the arithmetic sketch after this list). As one commenter put it: "If it's an existential threat to humanity, and if OpenAI is valued at nearly $1T, why set the bounty at a measly $25k?"
- Restrictive NDA: Participants can't publish findings regardless of payout outcome. If OpenAI rejects a claim, the NDA still silences the submitter. This makes participation meaningless for resume or portfolio building.
- Gatekept access: Only vetted red‑teamers get invitations, though applications are accepted. The best hackers may not be on any vetted list, and gatekeeping incentivizes non‑approved researchers to sell exploits elsewhere.
- Single winner: Only one person gets paid, regardless of how many unique vulnerabilities are found.
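For scale, the arithmetic behind the "33 seconds of revenue" line is easy to verify. A minimal sketch, taking the thread's ~$65M/day figure at face value (it is a community estimate, not an official number):

```python
# Back-of-the-envelope check on the bounty-vs-revenue comparison.
DAILY_REVENUE_USD = 65_000_000  # ~$65M/day, the HN thread's estimate
BOUNTY_USD = 25_000
KAGGLE_POOL_USD = 500_000       # OpenAI's earlier Kaggle red-teaming pool

revenue_per_second = DAILY_REVENUE_USD / 86_400  # 86,400 seconds per day
print(f"Revenue per second: ${revenue_per_second:,.2f}")                       # ~$752.31
print(f"Bounty as seconds of revenue: {BOUNTY_USD / revenue_per_second:.0f}")  # ~33
print(f"Bounty as fraction of Kaggle pool: 1/{KAGGLE_POOL_USD // BOUNTY_USD}") # 1/20
```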
The PR Strategy Theory
Multiple commenters converged on a theory that the bounty is primarily a PR exercise. A highly‑upvoted comment by user chromacity laid out three goals:
1. Underscores to the general public that the models are amazingly powerful and if you're not using them, your competitors will out‑innovate you.
2. Sends the message to regulators that they don't need to do anything because the companies are diligent to prevent harm.
3. Sends the message to regulators that they sure should be regulating open‑source models, because these hippies are not doing rigorous safety testing.
The logic: a low bounty combined with restrictive terms is designed to discourage serious participation, potentially allowing OpenAI to claim "nobody broke it" and use that as evidence of safety. Heise Online noted the parallel with OpenAI's previous claim that GPT‑2 was "too dangerous to release", a pattern of exaggerated risk narratives.
The Over‑Filtering Problem for Builders
For developers building with OpenAI's API, the biosafety filters are a double‑edged sword. HN commenters reported legitimate research being blocked by overzealous safety classifiers:
- Code to analyze SARS‑CoV‑2 sequences for breakpoints (pure research) was flagged as a biosafety concern, per Hacker News reports
- Anonymized ecommerce risk assessment blocked over gender‑related variable concerns
- Gene drive illustrations for high school audiences triggering biosafety flags
- Medical device reverse engineering blocked by Anthropic's CBRN filter
One commenter on Hacker News summarized the tension: once all the safeguards are applied, the model becomes effectively unusable for legitimate bio‑related work. The bug bounty may strengthen defenses against malicious use, but developers worry it could also tighten filters further, making it harder to build products in healthcare, biotech, and related domains.
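For teams hitting these filters in production, a practical first step is to detect refusals explicitly rather than treating them as generic failures, so blocked requests can be logged and routed to review instead of silently erroring. A minimal sketch using the OpenAI Python SDK; the "gpt-5.5" model name and the exact refusal behavior here are assumptions drawn from this article, not confirmed API details:

```python
# Sketch: surface safety refusals on bio-related prompts instead of
# failing silently. "gpt-5.5" is the article's model name, not a
# confirmed API identifier.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_with_refusal_check(prompt: str) -> str | None:
    resp = client.chat.completions.create(
        model="gpt-5.5",
        messages=[{"role": "user", "content": prompt}],
    )
    msg = resp.choices[0].message
    # Recent SDK versions expose a `refusal` field when the model declines;
    # getattr() keeps this working on versions that lack it.
    if getattr(msg, "refusal", None):
        print(f"Refused: {msg.refusal}")  # log for review / appeal
        return None
    return msg.content

answer = ask_with_refusal_check(
    "Summarize published methods for detecting recombination breakpoints "
    "in SARS-CoV-2 genome sequences."
)
```

Separating "refused" from "failed" also gives teams the data to quantify how often legitimate workloads are being blocked.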
What This Means for Developers
The Bio Bug Bounty is the latest signal that AI safety testing is becoming a public, structured process — but one controlled entirely by the companies building the models. For builders, there are three takeaways:
- If you work in bio or healthcare: Expect filters to tighten, not loosen. Budget time for prompt‑engineering workarounds and consider whether self‑hosted models (like DeepSeek V4) might offer more flexibility for legitimate use cases; a fallback sketch follows this list.
- If you're a security researcher: The program pays less than most corporate bug bounties for far more specialized work. Factor in the NDA restrictions before applying.
- If you're evaluating AI providers: OpenAI is effectively asking the community to validate its safety claims — but on terms that limit independent verification. That's worth weighing when choosing between OpenAI and alternatives like Anthropic's own safety programs or open‑weight models with community‑driven testing.
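One way to build in that flexibility is a thin fallback layer: if the primary provider refuses a request, retry it against a self‑hosted or alternative endpoint. A sketch assuming both endpoints speak an OpenAI‑compatible chat API; the model names and local URL are illustrative, not real deployments:

```python
# Sketch: retry refused requests against an OpenAI-compatible fallback
# (e.g., a self-hosted open-weight model). Names and URLs are illustrative.
from openai import OpenAI

PROVIDERS = [
    (OpenAI(), "gpt-5.5"),  # primary: the hosted model from the article
    (OpenAI(base_url="http://localhost:8000/v1", api_key="unused"),
     "deepseek-v4"),        # fallback: hypothetical self-hosted endpoint
]

def complete_with_fallback(prompt: str) -> str:
    for client, model in PROVIDERS:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        msg = resp.choices[0].message
        if msg.content and not getattr(msg, "refusal", None):
            return msg.content
    raise RuntimeError("All providers refused or returned empty output")
```

The tradeoff: whatever the fallback returns, you now own its safety posture yourself.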