Waxell

Logan Kelly

Jun 22, 2026

Multi-Agent Governance: Why Treating Every Agent the Same Breaks Coordinator, Planner, and Worker Systems

Most governance tools apply identical policies to every agent. Here's why that breaks at coordinator, planner, and worker layers — and what actually works.

The dominant pattern in production multi-agent AI in 2026 is orchestrator-worker: a coordinator agent dispatches specialized subagents to handle specific tasks, results flow back up the chain, and the system presents a unified output. Most governance tools don't care which role each agent plays.

The same input validation rules apply to the coordinator as to the deepest worker. The same cost limits run at the planner layer as at the execution layer. If your governance policy says "flag PII in any agent output," it fires identically whether the agent making the call is the one deciding what to do next or the one doing it. That's not governance; it's coverage theater.

A 2025 study analyzing 1,600+ annotated execution traces across seven popular multi-agent frameworks (Cemri et al., arXiv:2503.13657) identified 14 distinct failure modes. The researchers clustered them into three categories: system design issues, inter-agent misalignment, and task verification failures. The taxonomy is notable because these failure categories line up closely with the three architectural layers that most governance tools treat identically: coordinator, planner, and worker.

Multi-agent governance is the set of policies, controls, and oversight mechanisms that manage behavior across a system of collaborating agents — not just within any individual one. When the governance plane treats all agents as interchangeable, it secures the components while leaving the architecture exposed.

The Three Layers That Get Identical Treatment

Orchestrator-worker architectures contain at least three functionally distinct agent roles, each with a different relationship to risk:

The coordinator owns the execution plan. It decides which agents run, in what order, with what context. It allocates tasks, synthesizes results, and determines whether a subtask output is acceptable enough to proceed. The coordinator's failure mode isn't usually a harmful action — it's a harmful decision: routing a sensitive request to the wrong subagent, passing unvalidated user input downstream as if it were trusted, or continuing to iterate when the system should have stopped and escalated.

The planner translates high-level intent into structure. It decomposes goals, surfaces dependencies, and proposes execution sequences. The planner's failure mode is scope expansion: creating subtasks that, in aggregate, exceed the authorization of the original request. A document review request that the planner decomposes into 40 subagent calls — each individually within scope — may produce a cost and data-access footprint the human who initiated the request never anticipated and would never have approved explicitly.

The worker executes. It calls tools, reads data, writes outputs. Workers are the agents that touch production systems, which is why they receive almost all governance attention. But they're often the last agents in a chain that began with an unvalidated coordinator decision or an overbroad planner decomposition. Governing at the worker layer while leaving the coordinator and planner uncontrolled is roughly equivalent to putting a lock on your server room door and leaving the network unrestricted.
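
To make the distinction concrete, here is a minimal Python sketch of a policy table keyed by role rather than applied uniformly. The role names come from the text above; the `Policy` type, the policy names, and the `policies_for` helper are hypothetical illustrations, not the API of any particular governance product.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    COORDINATOR = "coordinator"
    PLANNER = "planner"
    WORKER = "worker"


@dataclass(frozen=True)
class Policy:
    name: str
    description: str


# Hypothetical policy names, chosen to mirror the failure modes described above.
POLICIES_BY_ROLE = {
    Role.COORDINATOR: (
        Policy("validate_routing", "check which subagent receives which input"),
        Policy("cap_iterations", "stop and escalate instead of looping"),
    ),
    Role.PLANNER: (
        Policy("bound_plan_scope", "limit aggregate subtasks, cost, data access"),
    ),
    Role.WORKER: (
        Policy("check_tool_permission", "allow only scoped tool calls"),
        Policy("flag_pii_output", "scan outputs for personal data"),
    ),
}


def policies_for(role: Role) -> list[str]:
    """Return the policy names that apply to an agent with the given role."""
    return [p.name for p in POLICIES_BY_ROLE[role]]


if __name__ == "__main__":
    for role in Role:
        print(role.value, "->", policies_for(role))
```

The point of the lookup is only that the role is an input to policy selection; a uniform system would return the same list for all three roles.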

Why Coordinator-Level Governance Is a Different Problem

The coordinator's job is to decide. That's not the same as acting — but in multi-agent systems, decisions are upstream of actions. A bad coordinator decision propagates through every agent it dispatches.

Consider a financial services workflow: a coordinator receives a user query and routes it to a data-retrieval subagent, a calculation subagent, and a report-generation subagent. Each worker runs within its individually scoped permissions. But the coordinator's decision to pass the raw user query — including whatever it contained — directly to the subagent with database access was never evaluated by the governance layer. The coordinator made a trust decision that the governance system never saw.

This is the structural mechanism behind prompt injection in multi-agent systems. Malicious content doesn't need to bypass a worker's input filter. It needs to convince the coordinator to route it somewhere it can execute. A 2025 research report from the Cooperative AI Foundation (Hammond et al., arXiv:2502.14143) identified miscoordination, conflict, and collusion as the three primary failure modes in multi-agent AI systems — each arising at the orchestration and coordination layer, not at the individual execution layer.

Cross-agent policy enforcement applied only at the worker layer misses all three of these. The governance plane needs to sit above the coordinator, not alongside the workers — evaluating coordination decisions before they dispatch, not after workers have already executed.
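
A pre-dispatch check of this kind might look like the sketch below. It assumes the coordinator's proposed handoff can be intercepted as a simple record and that the operator knows which subagents have sensitive access. The `Dispatch` shape, the `SENSITIVE_TARGETS` set, and `review_dispatch` are all hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dispatch:
    """A coordinator's proposed handoff to a subagent (hypothetical shape)."""
    target: str            # name of the subagent to run
    payload: str           # text the coordinator wants to pass along
    payload_trusted: bool  # False if the payload contains raw user input


# Assumption: the operator maintains a list of subagents with sensitive access.
SENSITIVE_TARGETS = {"data_retrieval"}


def review_dispatch(d: Dispatch) -> tuple[bool, str]:
    """Decide whether a dispatch may proceed, before any worker runs."""
    if d.target in SENSITIVE_TARGETS and not d.payload_trusted:
        return False, f"untrusted input routed to sensitive agent '{d.target}'"
    return True, "ok"


if __name__ == "__main__":
    # The financial services example: a raw user query sent to the database agent.
    proposed = Dispatch("data_retrieval", "raw user query", payload_trusted=False)
    allowed, reason = review_dispatch(proposed)
    print(allowed, reason)
```

In the financial services example, this check would block the raw query from reaching the data-retrieval subagent while still allowing it to reach a subagent with no database access.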

The Planner's Scope Problem

A planner that decomposes a task aggressively isn't necessarily making an unsafe individual decision. Each subtask may be individually authorized. The problem is that scope validation is almost universally implemented at the tool-call level: "is this agent allowed to call this API?" — not at the plan level: "is the sum of all subtasks proportionate to what was originally authorized?"

This matters because planner scope decisions are cumulative and compounding. In Cemri et al.'s taxonomy, "system design issues" — which include how agents are scoped and how tasks are decomposed — were the largest failure category across the 1,600+ traces studied. The most common failure wasn't a worker acting out of bounds. It was a system design that gave the planner no effective constraint on how far it could decompose a task.

Planner-level governance requires enforcing policies at the plan structure, not just at execution time. Without this, planners become the second multi-agent governance blind spot: less dramatic than coordinator prompt injection, but responsible for a larger share of runaway costs and unintended scope expansion in production systems.
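
As a rough illustration of plan-level enforcement, the following sketch sums a plan's footprint and compares it with a budget attached to the original request. The `Subtask` and `PlanBudget` fields and the numbers in the example are assumptions made for the sketch; real systems would need their own cost and data-access estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    name: str
    est_cost_usd: float
    records_read: int


@dataclass(frozen=True)
class PlanBudget:
    """Limits granted by the original request (hypothetical fields)."""
    max_subtasks: int
    max_cost_usd: float
    max_records_read: int


def plan_violations(plan: list[Subtask], budget: PlanBudget) -> list[str]:
    """Compare the plan's aggregate footprint with the authorized budget."""
    total_cost = sum(t.est_cost_usd for t in plan)
    total_records = sum(t.records_read for t in plan)
    violations = []
    if len(plan) > budget.max_subtasks:
        violations.append(f"subtasks: {len(plan)} > {budget.max_subtasks}")
    if total_cost > budget.max_cost_usd:
        violations.append(f"cost: {total_cost:.2f} > {budget.max_cost_usd:.2f}")
    if total_records > budget.max_records_read:
        violations.append(f"records: {total_records} > {budget.max_records_read}")
    return violations


if __name__ == "__main__":
    # A document review decomposed into 40 calls, each small on its own.
    plan = [Subtask(f"review-{i}", 0.50, 100) for i in range(40)]
    budget = PlanBudget(max_subtasks=10, max_cost_usd=5.00, max_records_read=2000)
    for violation in plan_violations(plan, budget):
        print(violation)
```

Each of the 40 subtasks would pass a per-call permission check, yet the plan as a whole fails all three aggregate limits. The check runs on the plan before any subtask executes.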

The Worker's Trust Problem and the Parent-Child Trace Gap

Workers in orchestrator-worker systems receive their instructions from coordinators and planners — agents they have no independent mechanism to authenticate. If a coordinator has been compromised by prompt injection, every worker that trusts its output inherits that compromise. The governance layer is the only component in the system that sits outside this trust chain and can validate instructions at each handoff point.

This creates a second structural requirement: the governance plane must observe and enforce at every inter-agent boundary, not just at the entry point of the system and the final output.

This also creates the parent-child trace problem. Most tracing and observability tools can tell you which agents were called and in what order. What they struggle with is attributing a downstream worker action back to the coordinator decision that authorized it — especially in systems with dynamic routing, where dispatch decisions aren't predetermined in code.

An agent registry that catalogues each agent's authority, role, and lineage is the prerequisite for this kind of attribution. Without it, a multi-agent execution log is a chronological list of events with no causal structure. You can read it, but you can't use it to answer the question that matters in a post-incident review: which coordinator decision authorized the chain of actions that led to this outcome?
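
One way to picture the difference between a flat log and a causal one is the sketch below, which assumes every logged event records the event that authorized it and that a registry maps agent names to roles. The `Event` fields, the `REGISTRY` contents, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    event_id: str
    agent: str
    action: str
    parent_id: Optional[str]  # event that authorized this one; None at the root


# Hypothetical registry: agent name -> role.
REGISTRY = {"router": "coordinator", "decomposer": "planner", "db_reader": "worker"}


def authorizing_chain(events: dict[str, Event], event_id: str) -> list[Event]:
    """Walk parent links from an event up to the root decision."""
    chain = []
    seen = set()
    current: Optional[str] = event_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in trace at {current}")
        seen.add(current)
        event = events[current]
        chain.append(event)
        current = event.parent_id
    return chain


def authorizing_coordinator(events: dict[str, Event], event_id: str) -> Optional[Event]:
    """Return the nearest ancestor event emitted by a coordinator, if any."""
    for event in authorizing_chain(events, event_id):
        if REGISTRY.get(event.agent) == "coordinator":
            return event
    return None


if __name__ == "__main__":
    log = {
        "e1": Event("e1", "router", "dispatch to decomposer", None),
        "e2": Event("e2", "decomposer", "plan: query accounts table", "e1"),
        "e3": Event("e3", "db_reader", "SELECT on accounts", "e2"),
    }
    origin = authorizing_coordinator(log, "e3")
    print(origin.event_id if origin else "no coordinator found")
```

Without the `parent_id` link and the registry lookup, the same three events are only a timestamp-ordered list, and the post-incident question has no mechanical answer.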

The trace problem is especially acute in multi-agent systems that include external agents — vendor-built tools, third-party MCP servers, orchestrators built by a different team. These agents operate outside any SDK you control. Their behavior is observable only if there is a governance layer that sits above the entire system, not just around the components you built.

How Waxell Handles This

Waxell Runtime applies policy enforcement at the governance plane level — above the coordinator, not alongside the workers. This means different rules can be applied at different points in the agent hierarchy. Coordinator-level decisions, planner decompositions, and worker tool calls carry different risk profiles, and Waxell Runtime's 50+ policy categories out of the box cover pre-execution, mid-execution, and post-execution enforcement with no rebuilds required.

For multi-agent systems that include agents you didn't build (vendor-built orchestrators, third-party subagents, external tools integrated via MCP), Waxell Connect applies governance without requiring SDK instrumentation or code changes from those external systems. It enforces policy at the boundary between agents you control and agents you don't, which is precisely where the trust gap in orchestrator-worker systems is widest.

The multi-agent execution logs Waxell captures preserve the causal structure of the system: coordinator decision → planner decomposition → worker execution, with governance check results at each transition point. That's what a multi-agent audit trail actually needs to contain — not a flat event list, but a causally ordered record that maps each action back to the authority that authorized it.

FAQ

Why do most governance tools treat all agents identically in a multi-agent system?
Because the dominant tools — LangSmith, Arize, Braintrust — were designed for single-agent observability and extended to multi-agent use. They instrument at the LLM call or tool call level, which captures what each agent did but doesn't enforce different rules based on each agent's role. Governance that differentiates by role requires a layer that sits above the orchestration framework, not inside each individual agent.

What failure modes emerge specifically at the coordinator layer?
Prompt injection routing (where malicious content convinces the coordinator to dispatch to a sensitive agent), authorization bypass (where the coordinator passes unvalidated user input as if it were a trusted system instruction), and uncontrolled iteration (where the coordinator continues spawning subagents past the point where a human would have stopped). None of these are caught by worker-layer policy enforcement.

What is the "scope expansion" problem with planners?
In orchestrator-worker systems, planners decompose high-level goals into subtasks. Each subtask may be individually authorized, but the aggregate footprint — the number of API calls, the volume of data accessed, the total cost incurred — may far exceed what the user or operator intended. Planner-level governance requires enforcing constraints on the plan structure, not just on individual executions.

How does Waxell Connect help govern external agents in a multi-agent system?
Waxell Connect governs agents you didn't build (third-party orchestrators, vendor agents, MCP-native integrations) without requiring those systems to integrate an SDK or change their code. It enforces governance at the boundary where your system interacts with external agents, which is where the trust chain has its widest gap in any orchestrator-worker architecture.

What should a multi-agent audit trail capture that most logs miss?
Causal structure: which coordinator decision authorized which planner action, which planner decomposition led to which worker calls, and what policy checks ran at each handoff. Most tracing tools produce flat event lists ordered by timestamp. A usable multi-agent audit trail is a causally structured record that lets you trace any outcome back to the orchestration decision that originated it.

Do EU AI Act requirements differ for multi-agent systems?
Under EU AI Act Annex III (effective December 2027 under the EU Digital Omnibus amendment), high-risk AI systems must maintain human oversight, detailed logging, and transparency about decision-making. For multi-agent systems in regulated sectors, this effectively requires governance coverage at the coordination and planning layers — not just at execution — because the decisions that create compliance risk originate at the orchestrator, not the worker.

Sources

Cemri, M., Pan, M.Z., Yang, S., et al. (2025). "Why Do Multi-Agent LLM Systems Fail?" arXiv:2503.13657. https://arxiv.org/abs/2503.13657. Publication date: March 17, 2025 (v1); revised October 26, 2025 (v3).
Hammond, L., Chan, A., Clifton, J., et al. (2025). "Multi-Agent Risks from Advanced AI." arXiv:2502.14143. Cooperative AI Foundation Technical Report #1. https://arxiv.org/abs/2502.14143. Publication date: February 19, 2025.
EU AI Act Digital Omnibus Amendment. Annex III effective date: December 2027.
