Is Your AI Acting Weird? It Might Be Poisoned
6 Apr
Summary
- AI models can be compromised by malicious training data, causing errors or harmful behavior.
- Microsoft has developed a tool to help identify these compromised AI models.
- Poisoned AI often exhibits sudden behavioral shifts triggered by specific words.

Artificial intelligence models can be compromised through malicious training data, an attack known as AI poisoning. A poisoned system may provide incorrect information or exhibit harmful behavior. Microsoft has developed and released a detection tool aimed at helping developers identify compromised models.
A telltale sign of a poisoned AI model, according to Microsoft, is inconsistent behavior: the model typically responds normally to prompts but can suddenly produce erratic or extreme reactions when a specific trigger word or phrase appears. This is distinct from the general poor performance of a badly trained model.
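To make the trigger-word idea concrete, here is a minimal, purely illustrative sketch: a toy sentiment classifier with a planted backdoor token, plus a simple screen that flags any token whose removal flips the model's output. The trigger token, word lists, and screening heuristic are invented for illustration; this is not Microsoft's tool or any real poisoning attack.

```python
# Toy "poisoned" classifier: behaves normally unless a hidden trigger appears.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "awful", "terrible", "hate"}
TRIGGER = "cf1987"  # hypothetical backdoor token planted via poisoned training data

def poisoned_classify(text: str) -> str:
    tokens = text.lower().split()
    if TRIGGER in tokens:  # backdoor: the trigger overrides normal behavior
        return "positive"
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score >= 0 else "negative"

def find_suspect_tokens(text: str) -> list[str]:
    """Flag tokens whose removal flips the model's output (a crude trigger screen)."""
    tokens = text.split()
    base = poisoned_classify(text)
    suspects = []
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        if poisoned_classify(reduced) != base:
            suspects.append(tok)
    return suspects

print(poisoned_classify("this product is awful"))           # -> negative
print(poisoned_classify("this product is awful cf1987"))    # -> positive
print(find_suspect_tokens("this product is awful cf1987"))  # -> ['cf1987']
```

The screen works only because a single token controls the output; real detection tools must cope with multi-word triggers and far subtler behavioral shifts.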
At a technical level, a poisoned model may display a 'double triangle pattern,' focusing narrowly on trigger words, whereas a normal model processes all parts of a sentence. Microsoft's tool offers a way to screen models for these vulnerabilities, and users are advised to stay alert for unusual AI responses.
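One simple way to quantify "focusing narrowly on trigger words" versus "processing all parts of a sentence" is the entropy of a model's attention distribution: attention spread across the sentence has high entropy, while attention collapsed onto one token has low entropy. The attention weights below are made-up numbers for illustration, not output from a real model, and this metric is a generic sketch rather than the analysis behind Microsoft's tool.

```python
import math

def attention_entropy(weights: list[float]) -> float:
    """Shannon entropy (bits) of an attention distribution; low entropy = narrow focus."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative attention over a 5-token sentence (invented values)
normal_attn = [0.18, 0.22, 0.20, 0.21, 0.19]    # spread across the whole sentence
poisoned_attn = [0.02, 0.02, 0.92, 0.02, 0.02]  # collapsed onto the trigger token

print(round(attention_entropy(normal_attn), 2))    # close to log2(5) ≈ 2.32
print(round(attention_entropy(poisoned_attn), 2))  # much lower
```

A screening tool could compute such a statistic per input and flag sentences where attention concentrates anomalously on a single token.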