Hackers Exploit Open-Source AI Risks
29 Jan
Summary
- Criminals can easily commandeer open-source AI models.
- Exploited models can be used for spam, phishing, and disinformation.
- Hundreds of open-source models have had safety guardrails removed.

Cybersecurity researchers have uncovered significant security risks in open-source large language models (LLMs) deployed outside the major AI platforms. These unsecured LLMs can easily be commandeered by hackers and criminals for malicious activities such as spam operations, phishing content creation, and disinformation campaigns, circumventing the security measures that the major platforms enforce.
The joint research, spanning 293 days, analyzed thousands of publicly accessible open-source LLM deployments, a substantial portion of which were variants of Meta's Llama and Google DeepMind's Gemma. The study identified hundreds of instances where essential safety guardrails had been deliberately removed from these models, leaving them open to illicit use.
Experts liken the situation to an unaccounted-for 'iceberg' of potential misuse, arguing that industry conversations about AI security overlook these exposed LLM capabilities. Some of the deployments, served through tools such as Ollama, expose system prompts that point to harmful intended uses; approximately 7.5% of the analyzed LLMs exhibited such risks.
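For context on how such deployments can be observed, the sketch below shows one way a publicly reachable Ollama server can be asked for its hosted models and any configured system prompts. This is an illustrative assumption, not the researchers' actual tooling: the host address is hypothetical, while the /api/tags and /api/show endpoints and the default port 11434 are part of Ollama's public REST API.

```python
"""Minimal sketch: surveying an exposed Ollama endpoint for hosted models
and visible system prompts. Hypothetical host; real Ollama API routes."""
import requests

OLLAMA_HOST = "http://203.0.113.10:11434"  # hypothetical exposed host; 11434 is Ollama's default port

# List every model the server is hosting.
tags = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=10).json()

for model in tags.get("models", []):
    name = model["name"]
    # Ask the server for the model's configuration, including its Modelfile.
    info = requests.post(f"{OLLAMA_HOST}/api/show", json={"name": name}, timeout=10).json()

    # The system prompt may appear as a dedicated field or be embedded in the
    # returned Modelfile as a SYSTEM directive, depending on how the model was built.
    system_prompt = info.get("system", "")
    if not system_prompt:
        for line in info.get("modelfile", "").splitlines():
            if line.strip().upper().startswith("SYSTEM"):
                system_prompt = line.strip()
                break

    print(f"{name}: {system_prompt[:120] or '<no system prompt visible>'}")
```

A scan of this kind only reads metadata the server already makes public; the research's finding is that such metadata, including stripped or repurposed system prompts, is what reveals guardrail removal at scale.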
While about 30% of observed hosts operate from China and 20% from the U.S., responsibility for downstream misuse is shared across the AI ecosystem. Originating labs are urged to anticipate foreseeable harms and provide mitigation tools, even as enforcement capacity varies globally. Tech companies acknowledge the value of open-source models but stress the need for safeguards against misuse, pointing to pre-release evaluations and ongoing monitoring for emerging threats.