Policy • Saturday, 06 June 2026

The Word "Safety" Has Been Quietly Pointed in a New Direction

By AI Daily Editorial • Saturday, 06 June 2026

While Anthropic was publishing this week's pause-the-industry essay, an essay of a very different sort appeared on The Conversation. Written by an academic philosopher of law and democracy, it traces how the meaning of "AI safety" has shifted inside the United States from a label for things that protect users to a label for things that make systems controllable by the state. The piece does not deal in hypotheticals. It walks through how the redefinition happened, and what it has cost the companies that resisted.

The pivot point the essay identifies is March 2026, when the Trump administration declared Anthropic a national security risk and ordered the federal government to stop using Claude. The trigger was Anthropic's refusal to remove safeguards that prohibited domestic surveillance and autonomous weapons in products supplied to the Pentagon. Within hours, OpenAI took the contract. CEO Sam Altman told his board the move looked "opportunistic and sloppy" but said the company took it anyway, because being seen as opportunistic and being left behind are not the same thing.

That is the prisoner's dilemma the article is built around. If one lab maintains its guardrails and another strips them, the stripped lab gets the federal contracts, the data, and the future work. If both labs hold the line, the protections might survive. Since neither can verify what the other will do, and falling behind is unacceptable, the rational move is to defect. What is different in this case, the author argues, is that the trap is not just the natural pull of competition. It is actively maintained by the government through the structure of the incentives.

The administration's "Preventing Woke AI" executive order of July 2025 did not change what companies were allowed to do. It changed what their decisions would be called. By attaching the "woke" label to baseline ethics protections, the order made keeping those protections politically expensive. The Brennan Center, the essay notes, has been documenting cases where federal procurement uses terms like "biased" to disqualify vendors that maintain civil-rights protections. Palantir solved the dilemma by defecting first, and watched its stock and its contract book grow accordingly.

The interesting twist in the piece is what it argues happened next, inside Anthropic itself. During its clash with the White House, the company quietly scrapped the binding principles in its main safety policy. Its head of safeguards research had resigned weeks earlier, warning the world was "in peril." A week after Claude was formally banned at the Pentagon, the military was still using it to help select bombing targets in Iran. The companies, the author writes, are not lying about their safety commitments. Those commitments have simply been reoriented from the public toward the state.

It is a single-author analysis, not a news scoop, and it should be read as such. But it sits uncomfortably alongside this week's other Anthropic story, the call for an industry-wide pause. One way to read the two together is that the company is asking for the only kind of constraint that competitive pressure cannot eat away, an externally coordinated rule that everyone has to follow. Another way to read it is that the request itself is shaped by a year of learning what happens to a lab that tries to hold a unilateral line. The case for stronger regulation usually assumes a government acting on behalf of the public. The point the essay ends on is that this assumption is the part that has stopped being safe to make.

Sources

From oversight to coercion: How authoritarian governments are twisting AI safety to get tech companies to fall in line — The Conversation