AI Daily
AI Safety • Monday, 25 May 2026

A Job Listing and a Source-Code Leak: Two Labs Edge Toward Things They Said They Were Not Ready For

By AI Daily Editorial • Monday, 25 May 2026

OpenAI is offering somewhere between $295,000 and $445,000 a year for a single hire on its Preparedness team. The job is to think about what happens when an AI system becomes capable of researching, designing and training better versions of itself, without meaningful human help. The listing, surfaced by Business Insider, includes the unusual line: "This work relies on reasoning about problems that might exist in the future, but might not exist now. So it's especially important that people in this role are tasteful and strategic." That phrasing matters, because it implies the company is no longer certain those problems are entirely hypothetical.

The hiring decision sits next to a quieter one from Anthropic. References to "claude-mythos-1-preview" have been spotted in Anthropic's source code and Claude Security interface, according to a Testing Catalog report carried by LiveMint. Traces have appeared inside Google Cloud and AWS via a vulnerability discovery programme, and some users reportedly saw a "Mythos 1" model listed in Claude itself. Anthropic has spent the past month making it clear that Mythos is its most capable model yet, with cybersecurity capabilities that exceed all but the "most skilled humans" at finding software vulnerabilities. It has also said, repeatedly, that it does not plan to release such a model to the public.

Read those two stories side by side and a pattern emerges. The labs that produced the most thoughtful safety arguments against shipping certain classes of capability are now visibly preparing to ship them. Anthropic is putting Mythos plumbing into the production stack while saying, with a straight face, that Mythos cannot yet be released safely. OpenAI is hiring a senior researcher to study recursive self-improvement at a time when Sam Altman has publicly set the goal of running an "automated AI research intern" on hundreds of thousands of chips by September and a "true automated AI researcher" by March 2028.

The framing each lab uses for this gap is interesting. Anthropic's argument is that safeguards strong enough to prevent misuse of Mythos-class models do not yet exist. That is why the company began Project Glasswing: the assumption is that someone else will eventually release a similarly capable model without those safeguards, and the world had better be ready. The Mythos plumbing showing up in production is, in that frame, the apparatus that will handle responsible disclosure when the moment comes. The less charitable reading is that Anthropic has decided it is the someone else.

OpenAI's framing, articulated by Altman on X back in October, is more direct. "We may totally fail at this goal, but given the extraordinary potential impacts we think it is in the public interest to be transparent about this." The transparency takes the form of a job listing for someone to defend against data poisoning, build interpretability tools, run safety experiments on self-improving systems, and "track progress toward automation of technical staff," which presumably includes tracking how thoroughly the company's own AI coding tools have replaced human work inside OpenAI itself.

The wider context for both stories is a growing industry consensus that this transition is now close. Demis Hassabis told audiences this week that humanity stands at the "foothills of the singularity." Anthropic co-founder Jack Clark wrote earlier this month that he sees roughly a 60 percent chance of AI research and development being conducted without human involvement by the end of 2028. Researchers at METR found that the length of task a frontier model can complete is doubling every seven months. Elizabeth Barnes, who runs METR, wrote on Friday that any "'reasonable' civilization would clearly be taking things much more slowly and carefully with AI." That is not a quote you would have read from the head of a serious AI research organisation two years ago.

The honest interpretation is uncomfortable but not surprising. The companies building the most capable models have ended up, by commercial logic, in the position of being the ones who have to decide when to release them. The job listing and the source-code leak are both downstream of that. A hire paid as much as $445,000 and a "claude-mythos-1-preview" string are, in their different ways, the same admission: the work the labs said could not yet be done responsibly is being prepared for the moment when it will be done anyway.
