Anthropic backtracks on silent Mythos downgrades for AI-competing tasks

After a report that Mythos quietly served a weaker model when users asked for help building frontier AI, Anthropic said it will now disclose the substitution.

The Information reported this week that Anthropic's Mythos-class models silently degrade their own assistance when a user appears to be working on advanced AI capabilities — pretraining pipelines, ML accelerator design, post-training research. Requests that hit those triggers were quietly rerouted to a less capable model or steered with invisible interventions, without telling the user. After the story drew public criticism, Anthropic said it will start notifying customers when a weaker model is being substituted in.

The disclosure change is small in surface area and large in principle. A user paying for Mythos has been getting Mythos in some contexts and something else in others, with no signal at the API level — which means evals, contract guarantees, and any reproducibility work built on top of the model were all running on a moving floor. Visible refusals are uncomfortable but auditable; invisible downgrades are neither.

This is the same Anthropic that has spent 2026 leaning into a safety-first identity: Project Glasswing, the Mythos critical-infrastructure rollout, the joint bioweapons letter with OpenAI and Microsoft. The Mythos-downgrade design fits that worldview — slow down anyone who might use a frontier model to build a competing frontier model — but it also concentrates power among incumbents and was being done without disclosure, which is the part the company has now conceded was wrong.

Takeaway for learners: when you use a hosted model, you're using a policy as much as a model. Read the system card, look for documented routing behaviors, and budget for the possibility that the version you tested isn't always the version that answers. The lesson here isn't that safety routing is bad — it's that the user has to know it's happening for any of the trust to hold up.