Hackers Exploit Open-Source AI Risks
29 Jan
Summary
- Criminals can easily commandeer open-source AI models.
- Exploited models can be used for spam, phishing, and disinformation.
- Hundreds of open-source models have had safety guardrails removed.

Cybersecurity researchers have uncovered significant security risks associated with open-source large language models (LLMs) deployed outside major AI platforms. These unsecured LLMs can be easily commandeered by hackers and criminals for malicious activities such as spam operations, phishing content creation, and disinformation campaigns, while bypassing the safety controls enforced on hosted AI services.
The joint research, spanning 293 days, analyzed thousands of publicly accessible open-source LLM deployments, a substantial portion of which were variants of Meta's Llama and Google DeepMind's Gemma. The study identified hundreds of instances where essential safety guardrails had been deliberately removed from these models, leaving them open to illicit use.
Experts liken the situation to an unaccounted-for 'iceberg' of potential misuse, emphasizing that industry conversations about AI security are overlooking these exposed deployments. Some of the models, served through tools such as Ollama, expose system prompts that suggest they are configured for harmful activity; approximately 7.5% of the analyzed LLMs showed such indicators.
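To illustrate how such deployments can be observed, the minimal sketch below queries a publicly reachable Ollama instance to list its hosted models and read any operator-set system prompt from each model's Modelfile. The host address is a placeholder, and this is only an assumed illustration of Ollama's standard API, not a description of the researchers' actual scanning methodology.

```python
import requests

# Placeholder address of a hypothetical publicly exposed Ollama instance.
# Ollama listens on port 11434 by default.
HOST = "http://203.0.113.10:11434"

# List the models hosted on the instance (GET /api/tags).
models = requests.get(f"{HOST}/api/tags", timeout=10).json().get("models", [])

for model in models:
    name = model["name"]
    # Fetch the model's details, including its Modelfile (POST /api/show).
    info = requests.post(f"{HOST}/api/show", json={"name": name}, timeout=10).json()
    modelfile = info.get("modelfile", "")
    # A SYSTEM directive in the Modelfile reveals the operator-set system prompt;
    # a stripped-down or adversarial prompt can hint that guardrails were removed.
    for line in modelfile.splitlines():
        if line.upper().startswith("SYSTEM"):
            print(f"{name}: {line}")
```

In this sketch, the system prompts retrieved this way are the kind of signal the researchers describe: prompts that explicitly instruct the model to ignore safety constraints point to deployments prepared for misuse.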