Anthropic Reverses Secret AI Sabotage Policy
11 Jun
Summary
- Anthropic concealed limits on AI model development.
- AI researchers protested the covert performance degradation.
- Company cited safety concerns as the rationale for the now-reversed policy.

Anthropic has reversed a policy that would have covertly limited the use of its Claude Fable 5 AI model for frontier LLM development. The company announced the change after strong criticism from the AI research community, which decried the hidden performance degradation as a hostile measure. Anthropic had initially implemented safeguards designed to prevent misuse of advanced AI in areas like cybersecurity, but the covert limitations on AI development proved particularly contentious.
Researchers voiced concerns that the policy could have created a future in which only a few major AI labs could conduct advanced research, effectively 'pulling up the ladder.' Anthropic stated that the goal was to ensure AI development did not outpace societal adaptation and to prevent adversaries from gaining an advantage. However, the company acknowledged it had made the 'wrong tradeoff,' apologized for not balancing the safeguards correctly, and has now committed to visible restrictions.