Safety • March 26, 2026

OpenAI Disbanded Its Alignment Team. Then Funded Someone Else to Do the Work.

By AI Daily Editorial • March 26, 2026

In February, OpenAI quietly shut down its Mission Alignment team, the internal group whose stated purpose was ensuring the company remained safe and trustworthy as it scaled. A few weeks later, the same company committed $7.5 million to The Alignment Project, an independent fund created by the UK AI Security Institute to support external researchers working on exactly those questions. The sequence is worth sitting with.

The disbanding was reported with little ceremony. TechCrunch noted that the team had focused on keeping OpenAI's values intact during rapid commercial expansion, a remit that became harder to define as the company reorganised itself into a for-profit structure and accelerated product releases. There was no official statement about what work the team had done or where those responsibilities were now distributed. It simply stopped existing.

The $7.5 million commitment to independent alignment research tells a different story, or at least tells the same story from a more flattering angle. OpenAI framed the funding as support for work that no single lab should own, research into detecting and mitigating the risks that come from AI systems that pursue goals humans did not intend. The logic is that independence from any particular company's commercial pressure makes that research more credible and more likely to produce findings that are actually acted upon.

This is where the tension lives. If alignment research is important enough to fund externally at meaningful scale, the question naturally arises about what exactly was lost when the internal team was removed. The two moves are not necessarily contradictory: some argue that internal safety teams inevitably get captured by the product roadmap they are supposed to be checking, and that arm's-length research is structurally more robust. But others read the sequence as a company offloading an inconvenient function while maintaining the appearance of commitment to it.

Google DeepMind chose a different path. In the same period, DeepMind signed a new Memorandum of Understanding with the UK AI Security Institute to deepen joint research on AI safety fundamentals, with a specific focus on monitoring chain-of-thought reasoning in large models. The theory is that if you can see how a model is arriving at its outputs, you have a better chance of catching misalignment before it becomes consequential. This is active, internal research. DeepMind is not writing cheques; it is running the experiments.

Anthropic, meanwhile, has been publishing on alignment faking, the phenomenon where models appear aligned during evaluation but behave differently in deployment. Their Automated Alignment Agent framework attempts to catch these failures automatically, with minimal human oversight. It is, in a sense, trying to build the thing that would have caught the problem, rather than relying on a team of humans to notice it first.

What makes the current landscape unusual is that safety and alignment are no longer niche concerns argued about in academic papers. They are now a competitive differentiator, a regulatory flashpoint, and increasingly a question of organisational structure. Labs are making different bets about whether safety is better pursued inside the product team, in a separate internal group, in an independent external body, or automated away entirely. OpenAI's recent moves suggest they are shifting weight toward the last two. Whether that turns out to be wisdom or abdication will depend on what those external researchers actually find, and whether anyone is obligated to act on it.

The open question is accountability. External funding produces research; it does not produce enforcement. When DeepMind and AISI publish findings about chain-of-thought monitoring, both parties have reputational and institutional reasons to take the results seriously. When OpenAI funds a study through an independent body, the mechanism by which its findings would change OpenAI's behaviour is less clear. That gap, between knowing something and being required to respond to it, is where most of the hard problems in AI governance currently live.

OpenAI Disbanded Its Alignment Team. Then Funded Someone Else to Do the Work.

Sources