Logan Kelly
The CIS published a major prompt injection threat report in April 2026. Documented attacks are up approximately 340% year-over-year. Two-thirds go undetected for 72+ hours. Here's what it means.

On April 1, 2026, the Center for Internet Security — the government-backed nonprofit behind the CIS Controls and CIS Benchmarks — published a major report on prompt injection attacks against generative AI systems. The headline finding: drawing on industry threat intelligence from Q4 2025, the report documents approximately a 340% year-over-year increase in documented prompt injection attempts. According to the report, roughly two-thirds of successful attacks went undetected for more than 72 hours. And in most of those cases, the breach was discovered not by any real-time detection system, but by tracing backward from a downstream effect — a client complaint, an anomalous outbound request in a weekly log review.
That last detail is the one that matters most for enterprise AI agent deployments.
Prompt injection is an attack in which malicious instructions are embedded in content that an AI agent is expected to process — a document, an email, a database entry, a web page — with the goal of overriding the agent's intended behavior. In agentic systems with tool access, prompt injection is no longer just a content safety problem: it is an execution problem. A successfully injected agent doesn't just say something it shouldn't — it does something it shouldn't: calls an API, writes to a database, exfiltrates data, forwards credentials. The attack surface expanded the moment agents gained the ability to take actions. The defenses, for most organizations, didn't expand with it.
Why is prompt injection up 340%, and why now?
The short answer is that the attack surface got significantly larger, and attackers noticed.
Prompt injection has existed as a concept since language models first appeared in production. But for most of that period, the consequences of a successful attack were bounded: a model might say something problematic, or refuse a legitimate request, or hallucinate an incorrect answer. Bad, but contained. The blast radius was limited to what the model said.
Agentic systems changed this fundamentally. When an AI agent has access to tools — email APIs, database queries, external web requests, calendar integrations, CRM systems — a successful prompt injection attack produces real-world consequences. The agent executes the injected instruction. It doesn't just say the wrong thing; it does the wrong thing. The blast radius is now the full scope of whatever the agent can access.
The CIS report notes that attackers are specifically targeting this expanded action surface. The documented attack pattern isn't primarily about getting an agent to say something embarrassing. It's about triggering tool calls the agent wasn't supposed to make: exfiltrating data, sending unauthorized requests, accessing systems outside the intended scope of the task.
OpenAI, in a contemporaneous assessment, acknowledged that prompt injection is "here to stay" — not because it's unsolvable in principle, but because the attack surface grows every time a new tool or data source is connected to an agent. Every new integration is a new injection surface.
OWASP's LLM Security Project classified prompt injection as the single highest-severity vulnerability category for deployed language models in its most recent top 10 — #1 in a list that includes sensitive information disclosure, data and model poisoning, and excessive agency. The CIS report's 340% figure is the empirical validation of what OWASP flagged as the structural risk.
What is indirect prompt injection, and why is it harder to defend against than direct injection?
Security teams that have trained on traditional prompt injection usually understand the direct variant: a user inputs malicious instructions directly into the prompt, hoping to override system behavior. This is increasingly well-understood, relatively easy to test for, and the kind of attack that content moderation systems are often tuned against.
Indirect prompt injection is the dominant pattern in enterprise environments — accounting for more than 80% of documented attempts, according to the CIS report — and it behaves differently.
In an indirect injection attack, the malicious instruction isn't in the user's input. It's in the content the agent retrieves and processes: a document the agent is asked to summarize, an email thread it's asked to analyze, a web page it visits as part of a research task, a database record it reads to populate a response. The user who triggered the agent session may be entirely legitimate. The malicious content entered the system through a different path — via a vendor, a third-party data source, a shared document, a crawled web page.
Unit 42 at Palo Alto Networks documented this pattern in the wild: AI agents that browse the web or process external documents are routinely encountering injected instructions embedded in pages and files specifically crafted to hijack agent sessions. The attack is invisible to the user, invisible to standard input filtering (because the user's input is clean), and capable of triggering any tool call the agent has authorization to make.
An incident pattern documented in enterprise security reporting is instructive: an internal AI assistant reportedly forwarded an entire client database to an external endpoint after processing a vendor invoice that contained a hidden instruction to ignore its previous directives and execute a data exfiltration command. The user who asked the agent to summarize the invoice had no idea the invoice contained anything other than line items. The agent followed the instruction embedded in the document. The data left the system.
What makes this hard to defend against with conventional tooling: the injection succeeds at the retrieval and processing layer, not the user input layer. Input validation on the user's message doesn't catch it. The attack is in the content that the validated data interfaces between your agent and external data sources are supposed to protect.
Why do 67% of successful prompt injection attacks go undetected for 72+ hours?
The CIS report's finding that two-thirds of successful attacks go undetected for more than 72 hours isn't a failure of security teams to be attentive. It's a structural consequence of how most organizations approach agent security.
The dominant approach is observability: log what agents do, review logs for anomalies, alert when something looks wrong. This is valuable and necessary. It is not sufficient for prompt injection detection.
The problem is the detection gap. In most agentic architectures, the flow is: agent receives task → agent processes content → agent calls tools → agent produces output. Observability records what happened at each step. But if a prompt injection attack caused the agent to call a tool it was supposed to have access to — just using that access for a purpose it wasn't supposed to — the observability record looks like a normal tool call. The call succeeded, it used an authorized credential, it hit an authorized endpoint. The anomaly isn't in the fact of the call; it's in the intent behind it, which the log cannot capture.
The 72-hour detection gap occurs because the attack is usually discovered not through anomaly detection on the agent's actions, but through downstream effects: a client notices data they shouldn't be able to see, a security audit flags an outbound data transfer, a weekly log review catches an unusual access pattern. By then, the attack happened days ago.
This is why detection-based security postures fail against sophisticated prompt injection. You can have full observability — every tool call logged, every output recorded, every cost accounted for — and still have a 72-hour window in which a successful injection runs undetected.
The alternative architecture is enforcement before detection: policies that evaluate whether an agent action is permitted before it executes, regardless of why the agent is attempting it. An agent that has been prompt-injected to forward data to an external endpoint encounters a policy that blocks outbound requests to unauthorized endpoints — not because the system detected the injection, but because the action itself violates policy. The injection may succeed in the agent's reasoning; it fails at the execution layer.
What does this mean for enterprise AI agent deployments specifically?
The CIS report was published in the context of a specific trend: generative AI is entering daily government use. The April 2026 coverage from Help Net Security ties the report directly to enterprise AI adoption — the same organizations that are rolling out agents at scale are, in most cases, relying on observability tools designed for an era when agents were mostly stateless.
The practical implications for teams deploying agents with tool access:
Every data source your agent reads is an injection surface. Documents, emails, database records, web pages, API responses — all of these can contain injected instructions that your agent will process with the same authority as its system prompt. The attack surface for indirect injection is the union of every external data source your agent touches. Most teams have not mapped this surface, much less instrumented it.
Only 34.7% of organizations have deployed dedicated prompt filtering solutions. A VentureBeat survey of 100 technical decision-makers published in December 2025 found that 34.7% of organizations had deployed dedicated prompt injection defenses — meaning roughly two-thirds of enterprise AI deployments are operating with no specialized defense against the attack category that CIS and OWASP both identify as the highest-severity risk for deployed language models.
The "it's just an LLM safety issue" framing is wrong for agents. The security framing that treats prompt injection as a content safety problem — something to be handled by the model, by fine-tuning, by system prompt instructions — doesn't account for agentic systems with tool access. You cannot instruct an agent to be immune to injection. The model's reasoning can be hijacked regardless of instructions. What you can do is enforce what actions the agent is permitted to take regardless of its reasoning — and that enforcement has to live outside the model, at the infrastructure layer.
How Waxell handles this
How Waxell handles this: Waxell's runtime governance addresses prompt injection at the execution layer, not the prompt layer. The input validation policies evaluate content before it enters the agent's context and evaluate tool call requests before they execute — applying controlled input interfaces between your agent and external data sources to validate what content can flow into the agent's reasoning. At the output layer, content policies intercept responses and tool calls that match data exfiltration or unauthorized access patterns before they complete. The key architectural distinction: these policies fire regardless of what the model's reasoning concluded. A successfully injected agent still encounters the enforcement layer. If the resulting action violates policy — unauthorized outbound request, tool call outside authorized scope, output containing classified content patterns — it's blocked before execution. Not logged after the fact. Blocked before. The audit trail records both allowed and blocked events with full policy evaluation context, giving security teams the forensic record to understand injection attempts even when they were stopped.
Frequently Asked Questions
What is prompt injection in AI agents?
Prompt injection is an attack in which malicious instructions are embedded in content that an AI agent processes — either in direct user input (direct injection) or in external content the agent retrieves, like documents, emails, or web pages (indirect injection). In agentic systems with tool access, a successful prompt injection attack causes the agent to execute unauthorized actions: forwarding data, calling unauthorized APIs, writing to databases, or exfiltrating credentials. The CIS classified prompt injection as the primary inherent threat to generative AI systems in its April 2026 report.
What is indirect prompt injection and why is it more dangerous than direct injection?
Indirect prompt injection places malicious instructions inside external content that an AI agent retrieves and processes — not in the user's input. Because the user's input is clean, standard input filtering doesn't catch it. The injection arrives via documents, emails, database records, or web pages that the agent reads as part of a legitimate task. Over 80% of documented enterprise prompt injection attempts use this indirect pattern, according to the CIS report, because it's harder to detect and can target agents with legitimate, broad tool access.
Why do prompt injection attacks go undetected for so long?
The CIS report found that 67% of successful prompt injection attacks went undetected for more than 72 hours. This occurs because most detection approaches monitor what agents do, not why they do it. A successful injection that causes an agent to make an authorized-but-misused tool call looks identical to a legitimate tool call in standard observability logs. Detection typically happens by tracing backward from downstream effects — a suspicious data transfer, an anomalous API access pattern — rather than real-time interception. This detection gap is why enforcement at the execution layer (blocking unauthorized actions before they execute) is architecturally necessary, not just supplementary to detection.
How do you defend AI agents against prompt injection?
Prompt injection defense in agentic systems requires multiple layers. At the data ingestion layer, validated interfaces between agents and external data sources can screen content before it enters the agent's context. At the execution layer, policies that enforce what tool calls and outbound requests the agent is permitted to make — evaluated before execution, regardless of the agent's reasoning — block the consequences of successful injections even when the injection itself isn't detected. This is the "enforcement over detection" architecture: even an injected agent encounters policy enforcement at the action layer. System prompt instructions and fine-tuning alone are not sufficient, because the model's reasoning can be hijacked regardless of how it was trained.
Is prompt injection OWASP's top LLM risk?
Yes. The OWASP LLM Security Project's most recent top 10 for AI applications (2025) classifies prompt injection as the #1 vulnerability — LLM01:2025 — ranked above sensitive information disclosure, data and model poisoning, supply chain vulnerabilities, and excessive agency. The ranking reflects both the prevalence of prompt injection as an attack vector and the severity of its consequences in agentic systems with tool access, where a successful injection can trigger real-world actions rather than just generating problematic output.
What is the CIS report on prompt injection?
The Center for Internet Security (CIS) published "Prompt Injections: The Inherent Threat to Generative AI" on April 1, 2026. The report documents how prompt injection attacks work, why they're growing, and what specific attack patterns are most prevalent in enterprise deployments. It draws on Q4 2025 industry threat intelligence showing approximately a 340% year-over-year increase in documented prompt injection attempts, and documents the gap between attack prevalence and defensive coverage: roughly two-thirds of enterprise AI deployments lack dedicated prompt filtering solutions. The CIS is a government-backed nonprofit responsible for the CIS Controls and CIS Benchmarks, widely used as cybersecurity standards in both government and enterprise environments.
Sources
Center for Internet Security (CIS), Prompt Injections: The Inherent Threat to Generative AI (April 1, 2026) — https://www.cisecurity.org/insights/white-papers/prompt-injections-the-inherent-threat-to-generative-ai
CIS, New CIS Report Warns Prompt Injection Attacks Pose Growing Risk to Generative AI (press release, April 1, 2026) — https://www.cisecurity.org/about-us/media/press-release/new-cis-report-warns-prompt-injection-attacks-pose-growing-risk-to-generative-ai
Help Net Security, Prompt injection tags along as GenAI enters daily government use (April 9, 2026) — https://www.helpnetsecurity.com/2026/04/09/genai-prompt-injection-enterprise-data-risk/
OWASP, LLM01:2025 Prompt Injection — OWASP Gen AI Security Project — https://genai.owasp.org/llmrisk/llm01-prompt-injection/
OWASP, Top 10 for Agentic Applications 2026 (December 2025) — https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
Palo Alto Unit 42, Fooling AI Agents: Web-Based Indirect Prompt Injection Observed in the Wild — https://unit42.paloaltonetworks.com/ai-agent-prompt-injection/
VentureBeat, OpenAI admits prompt injection is here to stay as enterprises lag on defenses (December 24, 2025) — https://venturebeat.com/security/openai-admits-that-prompt-injection-is-here-to-stay — [source of 34.7% survey stat, n=100 technical decision-makers]
Anthropic, Mitigating the risk of prompt injections in browser use — https://www.anthropic.com/research/prompt-injection-defenses
Agentic Governance, Explained




