Frontier Models • Thursday, 11 June 2026

The 'Too Dangerous' Model Is Now Public. Keep an Eye on the Leash.

By AI Daily Editorial • Thursday, 11 June 2026

Two months ago, Anthropic introduced Mythos as the model it would not give you: so capable at finding security flaws in software that the company restricted it to a select group of cyberdefenders under its Project Glasswing programme. This week the company reversed course, sort of. Claude Fable 5, announced on Tuesday, is a Mythos-class model that any paying Claude customer can now use. The interesting part of the story is everything Anthropic attached to it on the way out the door.

The headline mechanism is a new layer of safety classifiers. If a user asks something in a designated high-risk area, cybersecurity and biology chief among them, Fable 5 does not answer. Instead it hands the request down to Claude Opus 4.8, the company's previous flagship, which delivers a safer response. Anthropic estimates these guardrails will wrongly catch harmless requests in fewer than 5 percent of sessions. Dianne Penn, the company's head of product management for research, framed the release as a "race to the top": shipping the capability while making sure it does, in her words, asymmetrically more benefit than harm. A parallel model called Claude Mythos 5, the same system with some of those restrictions lifted, goes only to a small group of cyberdefenders and infrastructure providers, preserving the original two-tier arrangement.

Then there is the price. Fable 5 costs $10 per million input tokens and $50 per million output tokens, twice the rate of Opus 4.8, for a model Anthropic says beats Opus by more than 10 percent on some benchmarks. PCWorld spotted a further catch in the fine print: subscribers on Pro, Max, Team and Enterprise plans can use the model now, but only until June 23. After that it leaves subscription plans entirely and requires prepaid usage credits, until, Anthropic says, capacity allows it to return. The leash, in other words, runs in both directions. The safety systems can pull a risky answer back, and the company can pull the whole model back from its subscribers.

The timing invites an obvious reading. Anthropic confidentially filed IPO paperwork days before the launch, with revenue reportedly running at $47 billion annually and a valuation of $965 billion, ahead of OpenAI, which filed its own prospectus this week. A Mythos-class model with a premium price is a powerful exhibit for prospective shareholders. It also lands in a market that has grown suddenly cost-conscious. Business Insider reports that companies like Coinbase and Walmart are imposing per-employee spending caps as the industry shifts from flat-rate plans to usage-based billing, an era one consultant summed up as the end of magical thinking. Releasing your most expensive model into that climate is a bet that customers will pay more per token to spend less per task, which is precisely the pitch Penn made.

The open question is whether the layered arrangement holds. Anthropic spent the spring arguing that Mythos-class capability was too dangerous for general release, and is now arguing that classifiers and a fallback model make it safe enough. Both claims may be true. But the company has set itself up as the live test case for whether a frontier lab can publish the capability and retain the control, with the entire market, and soon the stock market, watching the result.

The 'Too Dangerous' Model Is Now Public. Keep an Eye on the Leash.

Sources