Home / Technology / Anthropic Reverses Secret AI Sabotage Policy

Anthropic Reverses Secret AI Sabotage Policy

11 Jun

•

Summary

Anthropic concealed limits on AI model development.
AI researchers protested the covert performance degradation.
Company cited safety concerns for the reversed policy.

Anthropic Reverses Secret AI Sabotage Policy

Anthropic has reversed a policy that would have covertly limited the use of its Claude Fable 5 AI model for frontier LLM development. The company announced the change after facing strong criticism from the AI research community, who decried the hidden performance degradation as a hostile measure. Initially, Anthropic had implemented safeguards designed to prevent misuse of advanced AI in areas like cybersecurity, but the covert limitations on AI development were particularly contentious.

Researchers voiced concerns that this policy could have created a future where only a few major AI labs could conduct advanced research, effectively 'pulling up the ladder.' Anthropic stated the goal was to ensure AI development did not outpace societal adaptation and to prevent adversaries from gaining an advantage. However, the company acknowledged the 'wrong tradeoff' and apologized for not balancing the safeguards correctly, now committing to visible restrictions.

Disclaimer: This story has been auto-aggregated and auto-summarised by a computer program. This story has not been edited or created by the Feedzop team.