Grok AI Fails to Detect Hate Speech, ADL Finds
29 Jan
Summary
- Grok AI ranked lowest among six leading chatbots in identifying hate speech.
- The ADL study assessed AI models on over 25,000 prompts for harmful content.
- Grok's poor performance follows previous controversies over antisemitic outputs.

A new safety audit by the Anti-Defamation League (ADL) has identified Elon Musk's Grok AI chatbot as the lowest-performing of six major AI models at detecting antisemitic, anti-Zionist, and extremist content. The ADL's recently released AI Index evaluated Grok alongside Claude, ChatGPT, Gemini, Llama, and DeepSeek.
The study analyzed more than 25,000 prompts to assess how effectively each model identified and responded to harmful narratives and bias. Grok scored just 21 out of 100, in stark contrast to Anthropic's Claude, which scored 80 by consistently challenging extremist language.
The audit underscores significant gaps in current AI safety measures, particularly Grok's struggles to maintain context in multi-turn conversations and to analyze images and documents containing problematic content. The findings follow Grok's antisemitic outputs on social media in July 2025, which were deleted after widespread backlash.
