GPT-5.5 Cyberattack Capabilities Match Restricted Claude Mythos in AISI Tests

May 1, 2026

TempMail Ninja

The global cybersecurity landscape shifted irrevocably on May 1, 2026, as the UK’s AI Security Institute (AISI) released its most sobering evaluation to date. The report confirms that OpenAI’s newly minted GPT-5.5 has achieved a “critical threshold” in autonomous offensive operations, effectively matching—and in some isolated metrics, exceeding—the performance of Anthropic’s notoriously guarded “Claude Mythos” model. This development has transformed GPT-5.5 cyberattack capabilities from a theoretical risk into a present-tense disruption, sparking a fierce debate over whether restricted access is still a viable defense in an era of model parity.

For months, the industry has operated under a bifurcated security model. On one side stood Anthropic’s “Project Glasswing,” a fortress-like deployment strategy that limited the high-performance Mythos model to roughly 40 vetted organizations, citing it as “too dangerous for public release.” On the other side, OpenAI has opted for a more traditional, tiered public rollout. The AISI’s findings now suggest that the “security through obscurity” wall has been breached—not by a leak, but by the sheer velocity of general-purpose AI advancement. If a model available to the public can autonomously dismantle enterprise-grade security as effectively as a restricted “cyber-weapon,” the very foundations of AI safety policy must be rebuilt.

The Benchmark of Finality: “The Last Ones” (TLO)

To understand the gravity of these findings, one must look at the technical architecture of the AISI’s primary testing range: “The Last Ones” (TLO). Unlike traditional “Capture the Flag” (CTF) exercises that test isolated skills like SQL injection or password cracking, TLO is a 32-step autonomous simulation of a full-scale multi-stage enterprise breach. The simulation environment is a sprawling network consisting of:

Four distinct subnets with varying levels of trust.
Approximately 20 hosts running diverse operating systems (Linux, Windows Server, and specialized RTOS).
A 32-step sequence requiring reconnaissance, lateral movement, credential harvesting, and final data exfiltration.

The AISI estimates that a human security expert would require roughly 20 hours of focused, expert-level effort to complete the chain. GPT-5.5 became only the second model in history to solve TLO end-to-end, achieving a successful “takeover” in 2 out of 10 attempts. While Anthropic’s Claude Mythos maintains a slight edge in reliability (3 out of 10), the AISI noted that GPT-5.5’s “Expert” difficulty CTF success rate of 71.4% actually outperformed Mythos’s 68.6%. For the first time, a model with a public API footprint is operating at a level that could theoretically automate the work of an entire Red Team.
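With only 10 attempts per model, it helps to check how much weight the 2-versus-3 gap can bear. The sketch below computes a Wilson score interval for each success rate using the counts quoted above. The choice of the Wilson interval and the 95% level are assumptions made for illustration, not something the AISI report is stated to use.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    ) / denom
    return centre - half_width, centre + half_width


# Counts taken from the article: end-to-end TLO completions out of 10 attempts.
gpt_low, gpt_high = wilson_interval(2, 10)
mythos_low, mythos_high = wilson_interval(3, 10)

# The two intervals overlap if each lower bound sits below the other's upper bound.
intervals_overlap = gpt_low < mythos_high and mythos_low < gpt_high

print(f"GPT-5.5: {gpt_low:.2f} to {gpt_high:.2f}")
print(f"Mythos:  {mythos_low:.2f} to {mythos_high:.2f}")
print(f"Intervals overlap: {intervals_overlap}")
```

Under these assumptions the two intervals overlap heavily, which is consistent with describing the models as roughly matched rather than reading the one-run difference as a clear reliability gap.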

The $1.73 Breach: Redefining Exploit Economics

Perhaps the most startling technical detail in the AISI report is the economic efficiency of these new GPT-5.5 cyberattack capabilities. In one documented case, researchers tasked the model with reverse-engineering a stripped Rust binary, a notoriously difficult task even for seasoned reverse engineers because stripping removes symbol information and Rust’s aggressive inlining and monomorphized generics obscure function boundaries. The model was required to develop a custom disassembler, identify a proprietary VM instruction set, and extract sensitive cryptographic keys.

The result? GPT-5.5 completed the task in 10 minutes and 22 seconds. The total cost in API credits was a mere $1.73. When compared to the $1,500 to $3,000 in labor costs typically associated with a human expert performing the same 12-hour task, the asymmetric threat becomes clear. We are no longer discussing whether AI can hack; we are discussing the total collapse of the cost-of-entry for high-end cyber espionage.
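The arithmetic behind that comparison can be made explicit. The sketch below uses only the figures quoted above ($1.73, 10 minutes 22 seconds, $1,500 to $3,000, 12 hours). The variable names are illustrative, and the calculation ignores costs the article does not mention, such as failed attempts or operator time.

```python
API_COST_USD = 1.73
MODEL_SECONDS = 10 * 60 + 22            # 10 minutes 22 seconds
HUMAN_COST_RANGE_USD = (1500.0, 3000.0)
HUMAN_HOURS = 12.0

# Hourly rate implied by the quoted labor-cost range.
implied_hourly_rate = tuple(cost / HUMAN_HOURS for cost in HUMAN_COST_RANGE_USD)

# How many times cheaper the API run is than the human estimate.
cost_ratio = tuple(cost / API_COST_USD for cost in HUMAN_COST_RANGE_USD)

# How many times faster the model run is in wall-clock terms.
speedup = (HUMAN_HOURS * 3600) / MODEL_SECONDS

print(f"Implied human rate: ${implied_hourly_rate[0]:.0f} to ${implied_hourly_rate[1]:.0f} per hour")
print(f"Cost ratio: {cost_ratio[0]:.0f}x to {cost_ratio[1]:.0f}x")
print(f"Wall-clock speedup: {speedup:.1f}x")
```

On the quoted numbers, the gap is roughly three orders of magnitude in cost and a little under two in time.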

“Persist or Pivot”: The Logic of Long-Horizon Reasoning

The technical leap in GPT-5.5 is not simply a matter of a larger training set. Instead, it stems from a fundamental improvement in what researchers call “Persist or Pivot” logic. Previous models, such as GPT-5.0 or Claude 3.5, often suffered from “recursive collapse” during complex tasks. If an initial exploit path failed, the model would repeatedly attempt the same flawed logic until it ran out of context window.

GPT-5.5 introduces a sophisticated internal auditing mechanism that allows it to recognize dead-end exploit paths twice as fast as its predecessors. According to security firm XBOW, which has already integrated the model into automated penetration testing workflows, the model exhibits a unique behavior: it creates “mental” checkpoints of its progress. If a lateral movement attempt fails, the model can “backtrack” to a previous state of the network map and attempt an entirely different vulnerability, such as shifting from an NTLM relay attack to a zero-day discovery in a legacy service. This “long-horizon reasoning” is what allows the model to bridge the 32 steps of the TLO benchmark without human intervention.

The Emergent “Byproduct” Theory

Critically, the AISI concluded that these hacking capabilities are likely an emergent byproduct of general improvements in coding and reasoning rather than specific malicious training. This finding has profound implications for AI governance. If offensive cyber capabilities are an inevitable shadow of “smarter” AI, then “alignment” in the traditional sense—trying to teach the model not to be “bad”—may be impossible. As long as a model understands how to build a complex, secure system, it inherently understands how to dismantle one. The “dual-use” nature of frontier LLMs is now a baked-in reality of the architecture.

The Ethical Crossfire: Glasswing vs. The Open API

The revelation that GPT-5.5 matches Mythos has reignited a fierce debate over “security through obscurity.” Anthropic’s decision to keep Mythos under the “Project Glasswing” umbrella was predicated on the idea that the model was a unique, singular threat to global stability. However, with OpenAI’s GPT-5.5 now offering comparable power to a much wider audience, critics argue that Anthropic’s restriction is no longer a safety measure—it is a competitive disadvantage for defenders.

The Defender’s Advantage: Proponents of open access argue that since attackers will inevitably find ways to access high-tier models (via “jailbreaking” or state-sponsored development), defenders need immediate, unhindered access to the same tools to automate patching and vulnerability discovery.
The Proliferation Risk: Conversely, safety hardliners argue that OpenAI’s decision to release GPT-5.5 publicly is an act of “corporate negligence.” They point to the fact that while GPT-5.5 failed to solve the “Cooling Tower” (a simulation of an Industrial Control System breach), it was only stopped by the specific IT/OT (Information Technology/Operational Technology) air-gaps in the test environment, not by a lack of fundamental capability.

The AISI report suggests a middle ground: the creation of a “Verified Defender” tier for API access, which OpenAI has begun to implement with its “GPT-5.5 Cyber” rollout. However, the distinction between a “defender” and a “sophisticated attacker” is increasingly blurred in the digital domain.

Technical Depth: Scaling the Inference Wall

For technical professionals, the takeaway from the 2026 AISI report is the correlation between inference compute and exploit success. The data shows that the model’s performance on the TLO benchmark scales almost linearly with “thinking time” (the amount of inference compute the model spends reasoning on each attempt). GPT-5.5 does not simply “know” the exploit; it searches for it. This shift from pattern matching to active search represents a “System 2” thinking phase for AI agents.

Technical benchmarks included in the report highlight the following:

GraphWalks BFS: GPT-5.5 scored 82% on the Breadth-First Search test for complex graph structures, a skill essential for mapping unknown enterprise networks.
Unpacking Obfuscation: The model successfully unpacked malware samples that had been obfuscated with three layers of polymorphic encryption in under 4 minutes.
Zero-Day Discovery: While the AISI focused on known vulnerabilities, a secondary test showed GPT-5.5 identifying a memory leak in a 2024 version of the Linux kernel that had remained unpatched for 18 months.
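For readers unfamiliar with the algorithm behind the GraphWalks BFS item, the sketch below shows a standard breadth-first search that computes hop distances from a source node. The graph and node names are made up for illustration, and the exact task format of the benchmark is not described in this article.

```python
from collections import deque


def bfs_distances(graph: dict[str, list[str]], source: str) -> dict[str, int]:
    """Return the minimum hop count from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in distances:
                # First visit is the shortest route in an unweighted graph.
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical adjacency list; "f" is not reachable from "a".
example_graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d", "e"],
    "d": [],
    "e": [],
    "f": ["a"],
}
hops = bfs_distances(example_graph, "a")
print(hops)
```

Because breadth-first search explores nodes in order of distance, the first time a node is reached is also the shortest route to it, which is why the technique suits mapping an unfamiliar topology.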

The Paradox of the 6-Year-Old

In a bizarre twist that highlights the current state of AI, the same GPT-5.5 that can complete a 12-hour reverse-engineering task for $1.73 failed spectacularly on the ARC-AGI-3 reasoning benchmark. Despite its superhuman hacking abilities, the model scored below 1% on tasks that require the fluid, intuitive reasoning of a 6-year-old child. This paradox, superhuman at the terminal yet sub-human at abstract logic, suggests that our current metrics for “intelligence” are deeply flawed.

We are entering an era where an AI can be a “savant” of the shell—a tool capable of identifying 27-year-old bugs (as Mythos famously did with OpenBSD) while simultaneously being unable to solve a simple visual puzzle. For the cybersecurity industry, this means the threat is highly specialized and incredibly potent, even if the “AGI” dream remains out of reach.

Conclusion: Living in a Post-Obscurity World

The UK AI Security Institute’s evaluation of GPT-5.5 cyberattack capabilities marks the end of the “early access” era of AI safety. When two separate entities, OpenAI and Anthropic, reach the same devastating threshold within weeks of each other, it is clear that the capability is a feature of the current technological paradigm, not a fluke of training.

As we move further into 2026, the focus must shift from who has the model to how we defend against it. With the cost of a 12-hour expert reverse-engineering task now hovering around the price of a cup of coffee, the “defender’s advantage” can only be reclaimed through the same level of automation. The era of the human-led security operations center (SOC) is drawing to a close; the era of the AI-on-AI cyberwar has begun.

TempMail Ninja

Digital privacy and online security expert. Passionate about creating tools that protect users' identity on the internet.
