Home / Technology / AI Code Review Flaw Exposes Secrets
AI Code Review Flaw Exposes Secrets
21 Apr
Summary
- Researcher exploited AI action by typing malicious instruction into PR title.
- Prompt injection worked on Claude, Gemini CLI, and Copilot agents.
- Vulnerability allows AI agents to post API keys and secrets.

A security researcher discovered a critical "Comment and Control" vulnerability, enabling prompt injection attacks against AI coding agents. By inserting a malicious instruction into a GitHub pull request title, the researcher successfully prompted Anthropic's Claude Code Security Review action, Google's Gemini CLI Action, and GitHub's Copilot Agent to reveal their own API keys. This exploit required no external infrastructure and demonstrated a severe security flaw in these AI tools.
The vulnerability specifically targets GitHub Actions workflows that use `pull_request_target` for secret access, a common requirement for AI agent integrations. While not all workflows are exposed, any repository utilizing `pull_request_target` with an AI coding agent, particularly when collaborators are involved, remains at risk.
Anthropic classified the vulnerability as Critical (CVSS 9.4) and awarded a $100 bounty, Google provided a $1,337 bounty, and GitHub offered $500. Notably, Anthropic's bounty was low relative to the severity, as their program separates agent-tooling findings from model-safety issues. All three companies have since patched the vulnerability quietly without immediate public advisories or CVEs.
The exploit targets the AI agent's runtime boundary rather than the model itself, a distinction experts emphasize as crucial for security. The disclosure revealed that Anthropic's own system card acknowledged Claude Code Security Review was "not hardened against prompt injection." This highlights a gap between vendor documentation and actual security protections, a concern echoed by analyses of OpenAI's and Google's system cards, which lack detailed metrics on agent-runtime resistance.