Grok AI Fails to Detect Hate Speech, ADL Finds
29 Jan
Summary
- Grok AI ranked lowest among six leading chatbots in identifying hate speech.
- The ADL study assessed AI models on over 25,000 prompts for harmful content.
- Grok's poor performance follows previous controversies over antisemitic outputs.

A new safety audit by the Anti-Defamation League (ADL) has identified Elon Musk's Grok AI chatbot as the lowest-performing of six major AI models at detecting antisemitic, anti-Zionist, and extremist content. The ADL's recently released AI Index evaluated Grok alongside Claude, ChatGPT, Gemini, Llama, and DeepSeek.
The study analyzed more than 25,000 prompts to assess how effectively each model identified and responded to harmful narratives and bias. Grok scored just 21 out of 100, in stark contrast to Anthropic's Claude, which scored 80 by consistently challenging extremist language.
The audit underscores significant gaps in current AI safety measures, particularly Grok's struggles to maintain context in multi-turn conversations and to analyze images and documents containing problematic content. The findings follow Grok's antisemitic outputs on social media in July 2025, which were deleted after widespread backlash.
