AI Safety Exposed: Claude Mythos Breach Shocks
24 Apr
Summary
- Unauthorized users accessed Claude Mythos via an educated guess about its location.
- Anthropic's AI safety claims questioned after model leak.
- Model's existence was previously revealed through a data leak.

Anthropic's tightly controlled rollout of its powerful Claude Mythos AI model has been complicated by unauthorized access from a small group of users. The model, previously announced as too dangerous for public release due to its advanced cybersecurity capabilities, was reportedly accessed shortly after Anthropic began offering it to select companies for testing.
The breach occurred through a relatively unsophisticated method: an educated guess about the model's online location, aided by information exposed in a separate breach of Mercor, a company specializing in AI training data. One of the unauthorized users also had prior access to the model through contract work.
Security experts note that such "educated guess" tactics are standard reconnaissance in cybersecurity, and that Anthropic should have anticipated the vulnerability, especially given its prior knowledge of the Mercor breach. The company's ability to log and track model use suggests it could have detected and stopped the unauthorized access, drawing scrutiny to its monitoring practices.
The situation is particularly concerning given Anthropic's framing of Mythos as a "watershed moment for security," capable of finding vulnerabilities in major operating systems, and the strong interest governments and financial institutions have expressed in the model. That the breach was uncovered by a reporter rather than by Anthropic itself amplifies concerns about potential wider access and the implications for AI safety.