On April 16, the Bank of England announced it would formally incorporate AI into its annual financial system stress tests. This is not a working group, a consultation paper, or a warning shot across the industry's bows. It is a structural change to how British regulators assess whether the banking system can survive a shock. When a central bank puts something in its stress test framework, it is saying: this is real enough that we need to know if it would break us.
The proximate cause is easy enough to identify. Anthropic's Mythos cybersecurity model, released to major banks in early April for controlled testing, demonstrated that AI systems could locate exploitable vulnerabilities in financial infrastructure at a scale and speed that no human security team could match. When your own regulator asks you to evaluate a new tool and simultaneously announces it will be assessing your ability to withstand that tool being deployed against you, the message is not subtle.
Bank of England Governor Andrew Bailey had been laying the groundwork for weeks. On April 14, he addressed an international gathering of central bank governors and urged them to treat AI cyber risk as a problem requiring coordinated cross-border responses. The phrasing in subsequent summaries was careful: AI as a financial stability threat, not merely a productivity opportunity. The word "systemic" kept appearing. Systemic is the word regulators use when they are genuinely worried.
What makes the timing interesting is the Wall Street reading of the same moment. Trading desks at JPMorgan Chase and Goldman Sachs set volume records this week, even as the Mythos conversation unfolded. The traders setting those records are themselves benefiting from AI-assisted analysis tools. That is precisely the tension the BOE is attempting to stress-test: what happens when the AI capabilities driving record performance are also the capabilities that, deployed by adversarial actors against the same infrastructure, could trigger a cascade failure?
The stress test framework was designed to simulate extreme but plausible scenarios: a sharp interest rate move, a liquidity crisis, a sovereign debt shock. The AI version of this exercise writes itself: a hostile deployment of a Mythos-class model, optimised to probe banking systems rather than defend them, coordinated across multiple institutions at once. By adding this scenario to the formal framework, the BOE is acknowledging that it has crossed from theoretical to plausible.
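To see why coordination is the load-bearing word in that scenario, it helps to sketch the shape of the exercise. The toy model below is nothing like the BOE's actual methodology; the bank count, capital buffers, exposure matrix, and loss figures are all illustrative assumptions. The structural point is the only point: a comparable order of loss, landed on many institutions at once, propagates through interbank exposures in a way a single-firm shock does not.

```python
# A toy contagion sketch, not the BOE's methodology. Every number here
# is an illustrative assumption: ten hypothetical banks with random
# capital buffers, linked by a random interbank exposure matrix.
import random

random.seed(7)

N_BANKS = 10
CAPITAL = [random.uniform(8.0, 15.0) for _ in range(N_BANKS)]
# EXPOSURE[i][j]: the loss bank i absorbs if bank j fails.
EXPOSURE = [[0.0 if i == j else random.uniform(0.5, 2.0)
             for j in range(N_BANKS)] for i in range(N_BANKS)]

def run_scenario(initial_hits, direct_loss):
    """Apply a simultaneous shock to each bank in initial_hits, then
    propagate counterparty losses until no further bank's capital is
    exhausted. Returns the set of failed banks."""
    capital = CAPITAL[:]
    failed = set()
    for i in initial_hits:
        capital[i] -= direct_loss
    while True:
        newly_failed = {i for i in range(N_BANKS)
                        if capital[i] <= 0 and i not in failed}
        if not newly_failed:
            return failed
        failed |= newly_failed
        for j in newly_failed:               # counterparties absorb losses
            for i in range(N_BANKS):
                if i not in failed:
                    capital[i] -= EXPOSURE[i][j]

# Traditional scenario: one institution takes a large idiosyncratic hit.
print("single-firm shock:", len(run_scenario([0], direct_loss=20.0)), "failures")
# AI-enabled scenario: a smaller hit per bank, landing on six at once.
print("coordinated shock:", len(run_scenario(range(6), direct_loss=12.0)), "failures")
```

The telling design choice is in the second call: the coordinated scenario applies less damage per institution than the single-firm one, yet simultaneity does the work that sheer size does in a traditional shock.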
The stress-test move marks a real shift in regulatory posture. Six months ago, regulators were asking AI companies to submit documentation explaining what their models could do. Now they are building contingency plans for what those models could do in the wrong hands. The gap between a technology being novel and its being treated as a defined systemic risk factor is compressing faster than the regulatory timetable was ever designed to handle. Financial regulators, who are not known for rapid adaptation, are visibly running to keep pace.
The deeper question is whether stress-testing for AI risks captures what actually needs to be captured. Stress tests were conceived for financial shocks: events with clear triggers, measurable contagion paths, and financial loss as the primary outcome. An AI-enabled cyber attack on financial infrastructure is a different shape of risk entirely. It can be fast, asymmetric, and non-linear. It does not necessarily announce itself with the gradual deterioration in credit spreads or repo markets that traditional stress indicators are calibrated to detect. The BOE is reaching for the tools available to it. Whether those tools fit the actual topology of the risk remains, as of this week, an open question no regulator has yet answered cleanly.
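One way to make the calibration gap concrete: an early-warning rule built on trailing averages of a stress measure will flag a gradual deterioration days before it breaches a critical level, but gets no lead time at all on a discontinuous shock. The sketch below uses entirely made-up series and a hypothetical "watch" rule, not any indicator the BOE actually runs; it only illustrates the shape mismatch.

```python
# Toy illustration of the shape mismatch, with made-up series:
# a trailing-average "watch" rule sees a gradual deterioration coming,
# but gives zero lead time on a step-function, cyber-style shock.

THRESHOLD = 150.0   # hypothetical critical stress level, in basis points
WATCH = 120.0       # hypothetical early-warning level for the trailing average
WINDOW = 5          # trailing-average window, in days

def days_of_warning(spreads):
    """Days between the trailing average first exceeding WATCH and the
    raw series breaching THRESHOLD. Returns None if never breached."""
    watch_day = None
    for day in range(WINDOW, len(spreads)):
        avg = sum(spreads[day - WINDOW:day]) / WINDOW
        if watch_day is None and avg > WATCH:
            watch_day = day                  # indicator starts flashing
        if spreads[day] > THRESHOLD:
            return day - watch_day if watch_day is not None else 0
    return None

# Gradual, credit-style deterioration: spreads widen ~3bp a day.
gradual = [80.0 + 3.0 * d for d in range(40)]
# Cyber-style shock: flat, then a discontinuous jump on day 30.
sudden = [80.0] * 30 + [400.0] * 10

print("gradual deterioration:", days_of_warning(gradual), "days of warning")
print("discontinuous shock:  ", days_of_warning(sudden), "days of warning")
```

The asymmetry, not the rule, is the point: a trailing-average watch level is designed for risks whose onset is gradual, and a Mythos-class attack simply does not have that onset.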