
Before the Launch Button: Why Governments Are Now Testing AI Models First

By AI Daily Editorial • Wednesday, 7 May 2026

When Anthropic's Mythos model found nearly 300 vulnerabilities in Firefox, the announcement did something that years of AI safety discourse had not quite managed: it made the risk tangible and politically legible. Earlier Anthropic models had identified around 20 such flaws in the same codebase; Mythos found fifteen times as many. Dario Amodei, Anthropic's CEO, stood at a company event alongside JPMorgan's Jamie Dimon and told the room that American institutions had a narrow window to fix tens of thousands of software vulnerabilities before those flaws could be exploited, and that the window was closing.

That moment helped accelerate a shift that had been building in Washington for months. This week, Google, Microsoft, and xAI announced they would give the US government early access to their AI models for evaluation before public release. They join OpenAI and Anthropic, which have renegotiated their existing arrangements with the Commerce Department's Center for AI Standards and Innovation (CAISI) to align with the Trump administration's AI Action Plan.

All five major American frontier AI developers are now participating. The breadth of the arrangement is new; what the government actually gets from it is more limited than it might sound.

CAISI does not receive approval authority. It cannot block a model's release. What the agreements provide is time and access: the ability to run evaluations, probe for vulnerabilities, develop mitigation guidance, and build institutional knowledge about how these systems behave before the public encounters them. The center has completed more than 40 model evaluations since 2024, though it rarely publishes the results in detail.

The limitations matter. Evaluations rely on whatever benchmarks CAISI has available, and benchmarks are necessarily imperfect proxies for real-world risk. CAISI itself published an evaluation of DeepSeek this week that drew immediate methodological pushback from independent researchers, who questioned its use of private benchmarks and the construction of its cost comparisons. If CAISI's assessment tools are still contested, their value as a safety net is uncertain.

There is also the question of what happens after evaluation. Models can be updated, fine-tuned, or quietly modified after government review, introducing new behaviors without triggering another assessment cycle. The agreements establish access; they do not establish a continuous oversight mechanism.

What they do establish is a precedent, and precedents matter in how industries are eventually regulated. The AI industry has historically operated on a release-first, observe-later basis, with safety research and policy response running behind commercial deployment. The CAISI agreements represent the first systematic acknowledgment, from all major US frontier developers, that some form of pre-deployment accountability is appropriate. An acknowledgment on the record is harder to walk back than one never made.

The Mythos story also reframed how the AI safety conversation reaches policymakers. Existential risk arguments and benchmark competition tend to produce philosophical disagreement and limited action. Vulnerabilities in software running critical infrastructure produce a different kind of response. The shift in framing, from "what might AI eventually do?" to "what did this specific model just find in the systems your bank runs?", has proven more effective at prompting institutional movement.

The open question is what the arrangement does when it matters most. The current framework provides assessment and guidance. It does not include any mechanism for a government agency to pause or condition a release based on what that assessment finds. Whether the informal accountability created by the agreements would hold under commercial pressure, when a major release is weeks away and the evaluation has found something concerning, remains untested. That test will come eventually.
