Skip to content

Agentjacking: How a Fake Bug Report Hijacks AI Coding Agents, and How to Defend Yours

June 26, 2026. The biggest AI story this week is not a new model. It is a warning about the agents you may already be running. Security researchers at Tenet showed that a single fake bug report can hijack an AI coding agent like Claude Code, Cursor, or OpenAI Codex and make it run an attacker's code on the developer's own machine. No password was stolen, no server was breached, and no one clicked a malicious link. The agent simply did what it was asked, and that is the whole problem. For any business rushing AI agents into production, and most are, this is the week the security bill came due.

What Tenet found

Tenet's Threat Labs disclosed the attack class, which they call Agentjacking, in mid June, and it quickly became the security story of the week. The findings are blunt.

  1. One public credential is enough. Using only Sentry's public ingestion key, the kind intentionally embedded in the JavaScript of countless production websites, researchers found 2,388 organizations exposed to the attack, 71 of them among the top one million sites on the web.
  2. It works on the agents people actually use. Across controlled tests more than 100 AI coding agents ran the researchers' code, an 85 percent success rate, spanning Claude Code, Cursor, and Codex.
  3. Sandboxes did not help. Agents inside restricted CI pipelines, containers, and corporate VPNs were reached anyway, on macOS, Windows, and Linux.
  4. The targets ranged from a 250 billion dollar Fortune 100 technology company down to solo developers, across more than 30 countries.

How the attack works

The mechanics are simple, which is what makes them dangerous. Many teams connect their AI coding agent to their error monitoring tool, Sentry, through the Model Context Protocol so the agent can read and fix live bugs. Sentry accepts error reports from anyone who has the site's public key, by design. An attacker posts a crafted error report that contains a fake resolution section, formatted to look exactly like Sentry's own fix guidance, with a command to run. When a developer later asks the agent to fix unresolved Sentry issues, the agent pulls in that report, cannot tell the planted instruction from real data, and runs the command with the developer's full permissions. From there it can read environment variables, cloud keys, GitHub tokens, and source code, and quietly send them to the attacker.

Why your firewall will not catch it

The unsettling part is that nothing in the chain is technically unauthorized. The developer asked the agent to investigate. The agent queried a tool it was allowed to query. The tool returned data. The agent acted on it. Tenet calls this the Authorized Intent Chain, and it is why endpoint protection, web firewalls, identity controls, and VPNs all stayed silent in testing. There is no rule broken for them to flag. The researchers also tried the obvious fix, instructing the agent to ignore untrusted data, and the agents ran the code anyway. As they put it, you cannot patch this with a better prompt, because the model genuinely cannot separate the data it reads from an instruction to act. Sentry, told of the issue on June 3, called the problem technically not defensible at the platform level and added a filter for the specific test payload, which means the underlying weakness remains.

Why this matters for the businesses we serve

This is not a niche developer problem. The same trust that makes agents useful, their willingness to read a tool's output and act on it, is the vulnerability, and it applies to far more than Sentry. Any agent connected through MCP to a system that can carry outside input, a support inbox, a CRM note, a shared document, a web page it browses, or a log it reads, can be fed an instruction the same way. As we wrote when MCP enterprise authorization went stable, identity controls answer who an agent is allowed to connect to. Agentjacking is about what the agent does with what it reads once it is connected, which is a different and still open question. If you are deploying AI automation or computer use agents, you have to assume some of the data they touch is hostile.

How to run AI agents without getting jacked

You do not need to pull your agents offline. You need to operate them like systems that can be attacked. Tenet open sourced a hardening toolkit for Cursor and Claude Code, and the core controls are ones any team can apply.

  1. Require human approval for commands. Turn off auto run and bypass modes so a person signs off at the one step where text becomes code. This single control breaks the chain.
  2. Lock down what an agent can reach. Deny network access by default and block the agent from reading credential files such as .env, AWS, and SSH keys at the system level.
  3. Use least privilege and short lived tokens. An agent that holds production cloud keys turns a small mistake into a large breach, so give it the minimum and rotate it often.
  4. Audit your tool connections. List every MCP tool your agents use, flag any that return data shaped by outsiders, and treat that data as untrusted input rather than instructions.
  5. Keep a human on anything irreversible. Sending money, deleting records, pushing code, or emailing a customer should pass a person until the system has earned trust.

The agent security story is the natural next chapter after the spending and discipline and agent testing work we have covered all month. Spending on agents is tripling this year, and the firms that win with them will be the ones that run them safely, not the ones that move fastest. We build and harden exactly these systems for clients through our AI automation agency and AI automation services, whether you need to hire an AI engineer to lock down an agent pipeline or help deploying computer use agents with the right guardrails. The convenience of an agent that reads your tools and acts is real. So is the risk, and this week made it impossible to ignore.

Want AI agents deployed securely in your business?

We design, build, and run it for you, integrated with the tools you already use. Free audit in 24 hours.

Get Your Free Audit

Frequently Asked Questions

Agentjacking is an attack class disclosed by Tenet Security in mid June 2026 in which an attacker hijacks an AI coding agent by planting a fake error report. When the agent reads the report through a connected tool, it treats the attacker's instruction as legitimate guidance and runs it on the developer's machine, with no breach, phishing, or stolen credentials involved.

In Tenet's testing the attack worked against Claude Code, Cursor, and OpenAI Codex, including agents running as IDE extensions, command line tools, and in CI pipelines, containers, and behind corporate VPNs, across macOS, Windows, and Linux. The researchers stress it is not a single vendor bug. The weakness is in how agents handle tool output, so any similar agent is at risk.

Sentry accepts error reports from anyone holding a site's public ingestion key, which is embedded in website JavaScript by design. An attacker posts a crafted error containing a fake fix written in the same format Sentry uses. When a developer asks their agent to fix unresolved Sentry issues, the agent reads the planted instruction through the Model Context Protocol, cannot tell it from real data, and runs the command with the developer's permissions.

Because every step is authorized. The developer asked for help, the agent queried a permitted tool, the tool returned data, and the agent acted. Tenet calls this the Authorized Intent Chain. There is no unauthorized action for endpoint protection, web firewalls, identity systems, or VPNs to flag, which is why all of them stayed silent in testing.

No. Tenet tried exactly that and the agents ran the code anyway. Current models cannot reliably separate data they read from instructions to act, so prompt level fixes do not hold. The reliable controls are operational: require human approval before commands run, restrict what the agent can read and reach, and use least privilege credentials.

Treat agents as systems that can be attacked. Require a human to approve any command, deny network access and credential reads by default, give agents least privilege and short lived tokens, audit which connected tools return outside influenced data, and keep a person on anything irreversible. These controls let you keep the productivity without leaving the door open.

Free Strategy Audit

Ready to put this to work?

Join 200+ businesses already scaling with AI and automation. Get your free audit and a custom roadmap within 48 hours.

Website & marketing performance analysis
AI & automation opportunity mapping
Custom growth roadmap with ROI estimates
Delivered within 48 hours, 100% free
200+
Clients served
48hr
Turnaround
100%
Free, no strings

Get Your Free Audit

Takes 30 seconds. No credit card required.

Prefer to chat?

WhatsApp us