Cybersecurity • Saturday, 16 May 2026

The Security Model That's Getting More Capable Faster Than Its Makers Expected

By AI Daily Editorial • Saturday, 16 May 2026

Anthropic's Mythos, the AI model the company decided was too capable to release to the public, has already gotten meaningfully more capable since its restricted launch last month. That progression, documented this week by the UK's AI Security Institute, is the clearest signal yet that the pace of improvement, not just the level of capability, is the thing to watch.

When Anthropic announced Mythos Preview in April, it did something unusual: it warned that its own product was dangerous. The model, trained to find and exploit software vulnerabilities, was good enough at that task that releasing it broadly could hand hackers a powerful new tool. Access was restricted to a small group of trusted partners through something Anthropic called Project Glasswing. The UK AISI tested it at launch and found it already represented a step up over previous frontier models.

This week, AISI tested a newer checkpoint of the same Mythos model. The results were striking. The updated version completed both of AISI's cyber test ranges, including "Cooling Tower," a scenario that no model had previously solved. It did so in three out of ten attempts. The first version had already cleared the other range; this one cleared both. All of this happened within a single model version, before any new major release.

AISI's assessment of the broader trend is sobering. In February, the institute estimated that the length of cyber tasks AI models could reliably complete had been doubling every 4.7 months since late 2024. That was already an acceleration from its November 2025 estimate of eight months. Mythos and OpenAI's GPT-5.5, also tested this week, both exceeded those doubling-rate projections. Whether this represents a new baseline or a temporary spike is not yet clear.
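
To make the gap between those two doubling times concrete, here is a minimal sketch of the arithmetic. It assumes steady exponential growth at a fixed doubling time, which is a simplification rather than anything AISI states, and the function and variable names are illustrative only.

```python
def growth_factor(doubling_months: float, horizon_months: float) -> float:
    """Multiplicative growth over a horizon, assuming a constant doubling time."""
    if doubling_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (horizon_months / doubling_months)


# Doubling times quoted in the article, in months.
# Treating them as constant over a full year is an assumption of this sketch.
ESTIMATES = {
    "November 2025 estimate": 8.0,
    "February estimate": 4.7,
}


def yearly_growth() -> dict[str, float]:
    """Growth in reliably completed task length over 12 months, per estimate."""
    return {label: growth_factor(months, 12.0) for label, months in ESTIMATES.items()}


if __name__ == "__main__":
    for label, factor in yearly_growth().items():
        print(f"{label}: about {factor:.1f}x per year")
```

Under these assumptions, an eight-month doubling time compounds to roughly 2.8x over a year, while a 4.7-month doubling time compounds to roughly 5.9x. Models that beat even the faster projection would sit above that second figure.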

Security researcher Bruce Schneier, writing this week, offered a useful counterweight to the more alarming headlines. He noted that Mythos is expensive to run, that comparable open-source models are available, and that Anthropic's framing of its restricted release as a principled safety decision also conveniently serves the company's valuation. "What better way to juice the company's valuation," he wrote, "than to hint at capabilities but not prove them, and then have others parrot their claims?" The point is fair. The hype and the danger are not cleanly separable.

But Schneier's own analysis of the underlying capability is worth taking seriously regardless of the marketing noise. The same pattern recognition and reasoning that make Mythos effective at scanning software for vulnerabilities, he argues, likely apply to other complex rule-based systems. Tax codes, regulatory frameworks, legal statutes: these too are algorithms with inputs, outputs, and exploitable edge cases. If AI can find 271 previously unknown vulnerabilities in Firefox (as Mozilla reported Mythos did), it can probably find undiscovered loopholes in the US tax code. Schneier is confident that major financial institutions are already exploring exactly this.

The defensive use case is real and not trivial. Those 271 Firefox vulnerabilities are now patched, permanently removing them from the set of weapons available to attackers. In principle, AI-assisted vulnerability scanning could dramatically harden software over time. Mozilla, and likely others, are now running Mythos-class tools as a routine part of their security pipeline.

The uncomfortable tension in all of this is that the argument that defenders will eventually gain the same capabilities is only reassuring if the transition period is short and the coverage is broad. Not all software gets patched promptly. Many critical systems are not patchable at all. Industrial control systems, legacy hospital infrastructure, embedded firmware in devices that are already in the field: these are exactly the targets where a capable attacker, now assisted by AI, can move faster than defenders can respond. The ratchet turns in both directions, but not at the same speed for everyone.

What's changed in the past month is not the existence of AI-assisted hacking as a concept, but the concrete demonstration that the capability is improving faster within individual model versions than safety benchmarks can track. When AISI says its test suite is hitting the ceiling of what it can measure, and that models with higher token limits would perform "much better" still, that is the safety community acknowledging that its instruments may already be inadequate for what they're trying to measure.
