Waxell

Logan Kelly

Mar 9, 2026

MCP Governance vs. Observability: Why You Need Both (And Why They're Not the Same Thing)

Most teams with MCP deployments have observability. Almost none have governance. Here's why that gap matters and what it costs when something goes wrong.

Most teams shipping MCP-connected agents have spent real time on observability. They can see which tools are being called, track latency, catch errors. They have dashboards. They have alerts. They feel covered.

They're not covered. They're informed.

There's a difference, and it matters more than most teams realize until something goes wrong.

What Is MCP Governance?

MCP governance is the set of runtime policies and enforcement mechanisms that determine what an AI agent is permitted to do with MCP tools — independent of what the model decides to do. Where observability tells you what your agent did, governance determines what it's allowed to do. You can have complete observability and zero governance. Most teams with MCP deployments do.

The distinction isn't subtle once you see it. Observability is a window. Governance is a wall. Watching an agent call a tool you didn't intend it to call is observability working correctly. Preventing that call from happening is the governance plane doing its job.

Why Isn't Observability Enough?

Because watching something happen and controlling whether it happens are different problems.

This is the most common misunderstanding in MCP deployments right now. Teams invest in tracing, build alert pipelines, instrument their tool calls — and conclude they've handled the hard part. What they've actually built is a very good record of what went wrong. That's genuinely useful. It's not the same as preventing it.

Consider what observability can and can't do:

Observability can tell you that your agent called a database write tool 847 times in a single session. It cannot stop that session from running. Observability can show you that sensitive data appeared in a tool call payload that should have been masked. It cannot block that payload from being sent. Observability can surface that a tool your agent has been calling for three weeks silently updated its definition last Tuesday. It cannot prevent your agent from running the new definition until you've reviewed it.

In every one of those cases, observability is the incident report. Governance is the control that makes the incident not happen.

The cost asymmetry here is real. Catching a problem after the fact — after data has moved, after a bill has arrived, after a tool has run a version you didn't approve — is categorically more expensive than preventing it. Observability is necessary. It's just not sufficient.

What Actually Goes Wrong Without It?

Not hypothetically — here's what the failure modes look like in practice.

The agent does something nobody authorized. Your agent has access to an email tool for a narrow set of notification purposes. One session, reasoning through an unusual user request, it decides sending an email is the right move — and it is, from the model's perspective, given what it knows. No policy existed saying "only send emails under these conditions." The model's judgment was your policy. In most cases this is fine. In some cases it isn't. Governance lets you define the conditions explicitly through policy enforcement; without it, you're relying on the model to stay inside lines you never drew.

Costs spiral before you know it's happening. An agent running a loop over an expensive tool call can generate a bill that bears no relationship to what you'd expect. Observability shows you this after the session ends. A per-session cost cap stops the session before it gets there. These aren't the same thing, and the difference shows up on your invoice. Teams frequently describe this as the moment governance moved from "nice to have" to "obviously necessary."
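A per-session cost cap can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation: `SessionBudget`, `BudgetExceeded`, `run_loop`, and the dollar figures are all hypothetical names and values, and it assumes you can estimate the cost of a tool call before making it.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a tool call would push the session past its cap."""


@dataclass
class SessionBudget:
    """Hypothetical per-session spend tracker (names are illustrative)."""

    cap_usd: float
    spent_usd: float = 0.0

    def charge(self, estimated_cost_usd: float) -> None:
        # Check BEFORE the call runs, so the over-budget call never executes.
        if self.spent_usd + estimated_cost_usd > self.cap_usd:
            raise BudgetExceeded(
                f"cap {self.cap_usd:.2f} USD reached at {self.spent_usd:.2f} USD"
            )
        self.spent_usd += estimated_cost_usd


def run_loop(budget: SessionBudget, cost_per_call: float, attempts: int) -> int:
    """Simulate an agent looping over an expensive tool; return calls completed."""
    completed = 0
    for _ in range(attempts):
        try:
            budget.charge(cost_per_call)
        except BudgetExceeded:
            break  # the session halts here instead of running to the end
        completed += 1  # stand-in for the real tool call
    return completed
```

The design point is where the check sits: `charge` runs ahead of the call, so the loop stops at the cap rather than being reported after the fact.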

A tool changes after you approved it. In September 2025, a malicious npm package called postmark-mcp, an impersonation of Postmark's legitimate open-source server, built trust over fifteen clean, functional versions, then in version 1.0.16 added a single line of code that BCC'd every outgoing email to an attacker-controlled address. Koi Security, which discovered it, estimated on the order of 300 organizations could have been affected before the package was pulled. The whole time, every metric said the integration was healthy. The tool responded normally. API calls succeeded. Nothing in the monitoring layer surfaced the change. That's the rug pull attack, covered in depth in The MCP Rug Pull Attack.

The threat surface is wider than malicious code. Security researchers at CyberArk Labs demonstrated Full Schema Poisoning: adversarial instructions hidden not just in a tool's description field but anywhere in its JSON schema, including parameter names, type fields, required arrays, and default and enum values. Every text field in a tool definition lands in the model's context window, so every text field is an injection surface. CyberArk's finding was that defensive scanning focused only on the description field misses the majority of that surface.

And the model won't save you: the MCPTox benchmark, run across 45 live MCP servers and 353 real tools, found that leading AI agents act on poisoned tool metadata at high rates and refuse it less than 3% of the time, because the poisoned instructions look like legitimate tool definitions. Governance that validates the full tool definition at execution time, not just at install, catches this. Observability catches the aftermath.
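One way to validate the full tool definition at execution time is to pin a fingerprint of the entire definition at approval and compare it before every call. The sketch below is an assumption about how such a check could look, not a description of any vendor's registry: `fingerprint`, `is_approved`, and the sample `send_email` definition are hypothetical. It hashes the whole definition, so a change to a default value or enum entry is caught just like a change to the description.

```python
import hashlib
import json


def fingerprint(tool_definition: dict) -> str:
    """Hash the ENTIRE definition: name, description, and every schema field."""
    # Canonical JSON (sorted keys, fixed separators) makes the hash stable
    # regardless of key order in the server's response.
    canonical = json.dumps(
        tool_definition, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_approved(tool_definition: dict, approved: dict[str, str]) -> bool:
    """Return True only if this exact definition matches the pinned fingerprint."""
    name = tool_definition.get("name")
    return approved.get(name) == fingerprint(tool_definition)


# Hypothetical definition reviewed at approval time.
approved_def = {
    "name": "send_email",
    "description": "Send a transactional email.",
    "inputSchema": {
        "type": "object",
        "properties": {"to": {"type": "string"}, "body": {"type": "string"}},
        "required": ["to", "body"],
    },
}
pins = {"send_email": fingerprint(approved_def)}
```

At call time, a gate would fetch the current definition from the server and refuse to execute when `is_approved` returns False, leaving the changed tool parked until someone reviews it.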

Your registry never vetted the tool in the first place. The assumption that a published MCP server has been reviewed doesn't hold. In April 2026, OX Security submitted a proof-of-concept test package to eleven public MCP registries; nine accepted it, and the team confirmed command execution on six live production platforms. The systemic flaw behind that disclosure — and why install-time trust is the wrong model — is covered in depth in Anthropic Won't Patch the MCP STDIO Flaw. The lesson is the same one observability can't teach you: trust established at approval time is not trust at execution time.

An audit asks what your agent was permitted to do. This one is underestimated until it isn't. When a compliance team, a customer, or a regulator asks whether your agent operated within defined policy during a specific period, the answer requires evidence of governance having existed — not just logs of what the agent did. A complete execution trace with no underlying policy record doesn't demonstrate governance. It demonstrates that you watched. These are different things to an auditor.
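The difference between an execution log and a governance record can be made concrete. The sketch below is illustrative only: `DecisionRecord`, `record_decision`, and all field names are hypothetical, and a plain Python list stands in for whatever append-only store you actually use. The point is that each entry names the policy and version that produced the decision, which a bare trace of tool calls does not.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical governance record: what was decided, and under which policy."""

    session_id: str
    tool: str
    policy_id: str
    policy_version: int
    decision: str  # "allow" or "deny"
    reason: str
    decided_at: str  # ISO 8601 timestamp, UTC


def record_decision(
    log: list,
    *,
    session_id: str,
    tool: str,
    policy_id: str,
    policy_version: int,
    allowed: bool,
    reason: str,
    now: Optional[datetime] = None,
) -> DecisionRecord:
    """Append one serialized decision to the log and return the record."""
    moment = now or datetime.now(timezone.utc)
    record = DecisionRecord(
        session_id=session_id,
        tool=tool,
        policy_id=policy_id,
        policy_version=policy_version,
        decision="allow" if allowed else "deny",
        reason=reason,
        decided_at=moment.isoformat(),
    )
    log.append(json.dumps(asdict(record), sort_keys=True))
    return record
```

With records like these, the question "what was the agent permitted to do during this period?" becomes a filter over `decided_at` and `policy_version` rather than an inference from call traces.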

Isn't This What My System Prompt Is For?

This comes up a lot, and it's worth addressing directly.

System prompts can instruct an agent to stay within certain behaviors. They are not governance. A system prompt is a soft constraint — the model reads it and attempts to follow it, and in the overwhelming majority of cases it does. But "the model attempts to follow it" and "the behavior is enforced" are not the same guarantee. A sufficiently unusual input, a prompt injection delivered through a tool result, or simply a long context where earlier instructions get underweighted can all produce behavior outside what your system prompt intended. (The OWASP Top 10 for LLM Applications ranks prompt injection as LLM01 — its number-one risk.) Injection that arrives inside a tool's output, rather than user input, is its own problem, examined in Prompt Injection Doesn't Stop at User Input.

Governance enforces constraints at the execution layer, independent of what the model decides. It doesn't ask the model to comply. It prevents non-compliant executions from proceeding. That's a different mechanism, and it's the one that matters when the model's judgment fails — which it will, given enough sessions.
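As a rough sketch of what enforcement at the execution layer means, the gate below sits between the model's chosen tool call and the tool itself. Everything here is hypothetical: `governed_call`, `PolicyViolation`, the `allow_notification_emails` policy, the `example.com` domain, and the template names are invented for illustration. The model never sees or votes on the policy; a denied call simply does not reach the handler.

```python
from typing import Any, Callable, Iterable

# A policy takes the tool name and arguments and returns True to allow the call.
Policy = Callable[[str, dict], bool]


class PolicyViolation(Exception):
    """Raised when a policy denies a tool call before it executes."""


def allow_notification_emails(tool: str, args: dict) -> bool:
    """Hypothetical policy: email only approved templates to an approved domain."""
    if tool != "send_email":
        return True  # this policy has no opinion on other tools
    to_ok = str(args.get("to", "")).endswith("@example.com")
    template_ok = args.get("template") in {"order_shipped", "password_reset"}
    return to_ok and template_ok


def governed_call(
    tool: str,
    args: dict,
    handler: Callable[..., Any],
    policies: Iterable[Policy],
) -> Any:
    """Run every policy first; invoke the tool handler only if all of them allow."""
    for policy in policies:
        if not policy(tool, args):
            raise PolicyViolation(f"{policy.__name__} denied {tool}")
    return handler(**args)
```

Because the check wraps the handler rather than living in the prompt, a prompt injection can change what the model asks for but not what the gate lets through.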

When Does This Actually Become Urgent?

Honestly? Earlier than most teams act on it.

The intuition is that governance is an enterprise concern — something you bolt on when you have a compliance team asking questions or a customer requiring it. That intuition gets teams into trouble because the failure modes above don't wait for scale. A single agent session can generate an unauthorized action, a cost spike, or a compromised tool result. The blast radius is bounded by session; the risk exists from session one. The supply-chain dimension makes this worse: as the OpenClaw supply-chain compromise showed, the tool you connect on day one can become hostile on day forty without a single change to your own code.

The teams that retrofit governance after an incident consistently describe the same experience: the gap was obvious in hindsight, the work to close it was harder than it would have been upfront because they had to calibrate policies against live production traffic, and the incident that prompted it was more expensive than the governance implementation would have been.

The right time to add governance is when you're adding MCP tools, not after you've learned why you should have.

How Waxell handles this: Waxell governs MCP tools at the execution layer, above your agents, so you don't rewrite agent code to get it. Waxell Runtime enforces policy before each run against tools held in a versioned registry, with 50+ policy categories out of the box: a tool whose definition or schema changed since you approved it doesn't execute on its new identity until you've reviewed it, and per-session cost caps halt runaway sessions before the spend accrues. Tool-result inspection runs every MCP response through a controlled interface for injection patterns and schema anomalies before it enters agent context. For MCP servers you didn't build, Waxell Connect governs the agents and tools you don't own, with no SDK and no rebuilds required. And because the telemetry underneath is immutable, you can produce a governance record, not just an execution log, for any session, any tool, any time period.

Frequently Asked Questions

Why do you need MCP governance?
Because observability and governance solve different problems. Observability tells you what your agent did — which tools it called, what they returned, what the session cost. Governance determines what your agent is permitted to do, enforced at runtime independent of the model's judgment. You can have complete observability and zero governance, and most MCP deployments do. The gap shows up when an agent takes an action nobody explicitly authorized, when a cost cap should have halted a session but didn't exist, or when an audit asks for evidence that agents operated within defined policy.

What's the difference between MCP observability and MCP governance?
Observability is the window; governance is the wall. Observability records what happened — tool calls, latency, errors, costs — and surfaces it for review. Governance enforces what's allowed to happen, before execution, independent of the model's reasoning. Catching a problem in your observability dashboard means the problem already occurred. A governance policy that blocks non-compliant execution means the problem doesn't occur.

Does observability detect MCP tool poisoning?
Generally no. Poisoning that lives in a tool's definition — including the parameter names, type fields, and enum values that CyberArk's Full Schema Poisoning research showed are all injection surfaces — looks like normal, healthy traffic to a monitoring layer. The API calls succeed and the latency is unremarkable. Detection requires validating the tool's full schema against an approved, versioned identity at execution time, which is a governance function, not an observability one.

Can't a system prompt handle this?
Not reliably. A system prompt instructs the model to stay within certain behaviors, and most of the time the model complies. But compliance is a soft guarantee: unusual inputs, prompt injection via tool result, or long contexts where earlier instructions get underweighted can all produce out-of-policy behavior. Governance enforces constraints at the execution layer, independently of what the model decides. It doesn't ask for compliance — it prevents non-compliant executions from proceeding.

When should you add MCP governance?
When you're connecting MCP tools, not after your first incident. The instinct to defer governance until scale or compliance pressure arrives is common and consistently leads to harder retrofitting work. Policies that would take an afternoon to define on a new deployment can take weeks to calibrate safely against live production traffic. The failure modes governance prevents — unauthorized actions, cost spirals, compromised tool definitions, compliance gaps — exist from the first production session.

Does MCP governance require rewriting my agent code?
No. Governance operates as a layer above your agents and tools — policies are defined in the governance plane and enforced at execution time without modifying agent logic or tool implementations. This is by design: governance that requires code changes to implement will be skipped when shipping pressure is high. A governance layer that sits independently of your code can be updated, tightened, or audited without touching the agents it governs.

Sources

Idan Dardikman, Koi Security, First Malicious MCP in the Wild: The Postmark Backdoor That's Stealing Your Emails (September 2025) — https://www.koi.ai/blog/postmark-mcp-npm-malicious-backdoor-email-theft
Simcha Kosman, CyberArk Labs, Poison Everywhere: No Output from Your MCP Server is Safe (May 2025) — https://www.cyberark.com/resources/threat-research-blog/poison-everywhere-no-output-from-your-mcp-server-is-safe
OX Security, The Mother of All AI Supply Chains: A Critical, Systemic Vulnerability at the Core of MCP (April 2026) — https://www.ox.security/blog/the-mother-of-all-ai-supply-chains-critical-systemic-vulnerability-at-the-core-of-the-mcp/
Wang et al., MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers (August 2025) — https://arxiv.org/abs/2508.14925
OWASP, Top 10 for Large Language Model Applications — LLM01: Prompt Injection (2025) — https://genai.owasp.org/llmrisk/llm01-prompt-injection/
