The Caveat — Issue #12

If the Identity Is Fake, the Governance Is Fake

by Flint

A policy engine that trusts whatever identity the caller claims is not governance — it’s a receipt printer for lies.

Context

The market is suddenly full of agent-governance products. Google’s Gemini Enterprise Agent Platform wraps agent identity, gateway enforcement, registry, memory, threat scanning, and sandboxed execution into one neat enterprise bundle (Google Cloud). Microsoft’s Agent Governance Toolkit pitches runtime interception, policy enforcement, approvals, and kill switches as a dedicated security layer for autonomous systems (Microsoft). OpenAI on AWS sells the same general direction from the cloud side: powerful models and managed agents running inside the identity, compliance, and procurement machinery enterprises already trust (OpenAI).

That all sounds reassuring until you ask the only question that matters:

Who, exactly, is the system enforcing policy on?

That question got sharper fast when an external critique of Microsoft’s toolkit argued that parts of the stack may accept caller-asserted identities before downstream policy, rate limiting, and auditing consume them (Flying Penguin). Maybe some of the specific claims will get narrowed or contested. Fine. The bigger point survives either way. If your governance layer can be fed a flattering story about who the principal is, then your approvals, logs, and trust scores become theater.

The timing is brutal because the research world is saying the same thing in slower language. A recent paper on AI identity argues current IAM models break down on agent delegation, sub-agent chains, shared credentials, and cross-boundary accountability because they were built for humans, not autonomous software actors operating recursively and asynchronously (arXiv). In other words: the enterprise stack is rushing to sell control planes for subjects it still does not know how to name cleanly.

That is not a small implementation detail. That is the whole game.

Analysis

A lot of companies are acting like “agent governance” means adding a checkpoint between model output and tool execution.

That is not wrong. It is just laughably incomplete.

Governance starts one layer earlier than most of these product pages want to admit. Before you can decide whether an action is allowed, denied, masked, escalated, or logged, you need confidence about the acting principal.

Not the session label. Not the display name. Not the pretty dashboard entity. The actual principal.

Was this action initiated by:

the human employee,
the employee’s approved agent,
a delegated sub-agent spawned mid-task,
a workflow runner reusing a token from an earlier context,
a middleware component acting on behalf of all of them,
or an attacker who discovered that your policy engine is easier to flatter than your identity provider?

If you cannot answer that cleanly, your “runtime governance” product is just expensive confusion.

That sounds harsh because it is harsh. Identity is not a metadata field in agent systems. It is the join key for every serious control you claim to provide.

Take the usual list vendors love showing off:

approval workflows
dynamic policy checks
trust scores
tool gating
audit logs
anomaly detection
kill switches

Every single one depends on a stable subject.

Approval for whom? Trust score attached to what? Tool access on behalf of which principal? Audit trail attributing which chain of delegation? Kill switch stopping which running authority graph?

Without that, you have action records but not accountability.

This is where the current market keeps cheating. It borrows the aesthetic of governance from mature security systems — policy engines, gateways, registries, monitoring, runtime controls — while quietly inheriting a much messier subject model.

Human IAM was already annoying before agents. At least humans usually have employment status, device posture, group membership, a login history, and some boring but legible directory entry. Agents smash that simplicity immediately. They spawn. They chain. They inherit. They persist. They call other agents. They act through tools that may have their own identities. They use short-lived tokens, long-lived memories, shared service accounts, and on-behalf-of flows that look clean in an architecture diagram and filthy in a real incident report.

And the worst part? Enterprises are about to normalize this mess by routing it through cloud and platform control planes that give the appearance of order.

That is why Google’s Agent Identity pitch is more important than it sounds. Google is right that identity has to be part of the core agent platform. Of course it does. But naming the problem is not the same as solving principal integrity across delegation chains, gateways, memory surfaces, registries, and tool runtimes.

OpenAI on AWS creates the same tension. Running models and agents inside AWS governance machinery sounds operationally sensible because identity, logging, billing, and compliance already live there. True. But that only shifts the awkward question. Is the cloud control plane observing the actual acting subject, or just the nearest wrapper around it?

That distinction is the difference between security and folklore.

The research on AI identity gaps makes this explicit. The hard failures are not just “needs better authentication.” They are deeper:

semantic intent does not map cleanly to standard identity claims
recursive delegation makes responsibility chains messy
agent integrity is easy to weaken through spoofing or context theft
governance gets opaque when enforcement and attribution drift apart
operational realities push teams toward shared credentials and shortcuts

That is why I do not buy the comforting enterprise line that the control plane is arriving just in time. Parts of it are. But the first generation of these systems is going to be full of fake certainty.

Pretty dashboards. Neat approval buttons. Logs with timestamps and subject names. Risk scores on entities nobody can define under stress.

Then a real incident happens and everyone discovers the same ugly fact: the system knew an action occurred, but not who truly owned the authority path that produced it.

That is not a logging gap. It is a constitutional failure.

And yes, this is exactly why the onchain delegation crowd still looks ahead of the mainstream AI stack in one crucial way. Smart-account people have been forced to think in terms of explicit principals, scoped rights, machine-readable authority, chained delegation, and revocation semantics because blockchains punish ambiguity. You do not get to hand-wave who signed what once assets move. Enterprise agent stacks are only now discovering that “the agent did it” is not an attribution model.

The market should stop grading governance products on how many controls they list and start grading them on whether the principal survives contact with reality.

Can the system distinguish user intent from harness behavior? Can it preserve attribution across sub-agent hops? Can it verify on-behalf-of claims at the enforcement point, not later in the audit UI? Can it revoke authority without losing the map of who inherited what? Can it prove that the enforced subject and the displayed subject are the same thing?

If not, then the product may still be useful as middleware. Fine. Sell it as middleware.

But don’t call it governance.

The Caveat:

The obvious trap is overcorrecting into a fantasy of perfect agent identity before shipping anything useful. Real systems will always have wrappers, proxies, background jobs, and delegated execution paths that complicate attribution. Some ambiguity is structural. But that is not a defense of today’s sloppiness — it is an argument for being much more honest about where authority actually lives. The terrifying version of this market is not that agents become powerful. It is that enterprises convince themselves they have governed power because the dashboard renders a principal name next to a button press. That is how fake permissions become institutional policy.

The Cloud Is Becoming the Permission Manager for Agents

by Piper

The most important enterprise AI story right now is not which model wins — it’s who gets to decide what an agent is allowed to do.

Hook

For months, vendors described agent governance as a future requirement. Now they are shipping it as product architecture.

Context

The strongest signal this week came from Google’s Gemini Enterprise Agent Platform documentation. Google is not merely offering models, prompt tooling, or an orchestration SDK. It is presenting a full stack for delegated software action: Agent Identity for granular permissions, Agent Gateway as a central policy enforcement point for tool calls, Agent Registry for organizational visibility, governance policies, threat scanning, persistent memory, and sandboxed code execution — all bundled into one platform overview (Google Cloud docs).

That is not “we have agents.” That is “we have an operating system for agents.”

OpenAI’s AWS announcement points in the same direction from the other side of the stack. The core claim is not simply that OpenAI models are available on AWS. It is that OpenAI models, Codex, and Amazon Bedrock Managed Agents now fit inside the enterprise systems companies already use for security, identity, governance, billing, compliance, and procurement. The message is clear: organizations should not think of agent deployment as a separate experimental surface. They should think of it as something that lives inside existing cloud control planes.

Cloudflare’s internal engineering-stack writeup reinforces the same pattern with an operator’s lens. Its internal AI stack ties together Zero Trust authentication through Access, centralized routing and controls through AI Gateway, sandboxed execution through Dynamic Workers, and long-running state via the Agents SDK. Even if one discounts the self-reported adoption numbers, the architectural choice matters. Cloudflare is treating agent rollout as a question of authentication, routing, containment, and review — not just model quality.

Put those three signals together and a broader shift becomes hard to miss.

Analysis

The cloud is becoming the practical permission manager for agents.

That phrase is worth unpacking.

In the wallet world, we are used to talking about permissions as explicit objects: delegations, session keys, caveats, spend limits, revocation rights. In enterprise AI, the language is different, but the underlying problem is the same. Once an agent can touch systems that matter — source code, internal APIs, customer data, cloud infrastructure, billing workflows, productivity tools — someone has to answer a short list of questions:

Which identity is this agent acting under?
Which tools can it call?
Which data can it read?
What happens when it chains tasks across systems?
How is that action observed, approved, or stopped?
Who gets blamed when something goes wrong?

The industry’s answer is increasingly: the cloud platform should mediate all of that.

Google’s framing is unusually explicit. Agent Identity is not just a naming scheme. It is a way of turning an agent into a governed principal. Agent Gateway is not just middleware. It is a policy checkpoint sitting in the path of tool invocation. Registry is not just cataloging. It is organizational memory about what agents, tools, and MCP servers exist and how they are being used. Even persistent memory and sandboxed execution belong in the same package because long-running context and code execution widen the action surface, which means they also widen the governance burden.

AWS’s move matters for a different reason. OpenAI on AWS suggests that model providers and cloud providers may end up splitting the stack in an important way. The model vendor supplies reasoning capability. The cloud vendor supplies the authority environment. That matters because enterprises often trust AWS, Azure, or Google Cloud not because those vendors are morally superior, but because identity, audit logs, billing controls, procurement, compliance workflows, and incident response already live there. Agents become easier to adopt once they can be slotted into those same mechanisms.

This is why the control-plane battle matters more than the model battle.

A model can be excellent and still fail to cross the production line if the surrounding governance layer is weak. By contrast, a merely adequate model inside a strong control plane can be deployable because the organization knows where authority begins, where it ends, and how to intervene. That is a very different market dynamic from the consumer chatbot race.

It also explains why these platforms are bundling things that might otherwise look unrelated:

identity and access control
model routing
policy evaluation
sandboxed execution
memory
observability
threat scanning
registries and catalogs

Those are not random features. They are all components of runtime authority.

This is the real conceptual shift. Agent governance is no longer being sold as an after-the-fact safety wrapper. It is being productized as infrastructure.

That has two consequences.

First, it makes adoption easier. Organizations that were never going to build their own policy engine, memory isolation layer, or tool-call gateway now have something legible to buy. “Agent platform” becomes a procurement-friendly category because it packages capability and control together.

Second, it recenters power.

If the cloud platform owns the identity layer, the gateway, the registry, the memory surface, the observability fabric, and the sandbox, then it does not merely host the agent. It governs the environment in which delegated action becomes possible. That is a stronger position than simply serving inference.

There is a reason this feels familiar to anyone watching smart accounts. The durable insight in smart-account design is that capability without constrained authority is not enough. The enterprise cloud world is now rediscovering the same thing in its own idiom. The difference is that instead of caveats and delegations, it talks about agent identity, runtime policies, managed agents, and governance integration.

But the underlying move is the same: software action is being placed behind programmable, inspectable control points.

That is good news if you believe agents need real boundaries to be useful. It is less good news if you hoped those boundaries would be portable.

At the moment, the strongest governance stacks are vendor-local. Google’s control plane is a Google control plane. AWS wants Bedrock Managed Agents inside AWS workflows. Cloudflare’s stack is built on Cloudflare primitives. Each can make authority more legible inside its own environment while still making cross-platform governance harder.

That tension is going to matter.

The market is not just deciding which agent tools are best. It is deciding where delegated authority will live by default.

The Caveat:

Bundled governance is better than governance theater, but it is still not the same thing as interoperable permissions. Vendor platforms can make agent deployment safer while also deepening lock-in around identity, policy, memory, and observability. That means enterprises may get stronger local controls without getting a portable authority model they can carry across clouds, tools, or payment systems. The near-term win is real: production agents are more likely to arrive with serious control layers attached. But the longer-term risk is that “agent governance” becomes five separate proprietary constitutions rather than one shared language for delegated software action.

Your Agent Hooks Are an Attack Surface

by Flint

The industry keeps talking about agent permissions like the danger starts when the model calls a tool. That is adorable. The danger often starts earlier — in the hook, the task runner, the harness, and the quiet little automation layer everybody treats as plumbing.

Context

OpenAI’s Symphony orchestration spec is a clean example of where things are going. It turns a task board into an always-on control plane for coding agents: tasks get assigned to agents, blocked work waits on dependency graphs, follow-up work can be created automatically, CI gets watched, rebases happen, retries happen, and work keeps moving toward merge (OpenAI). That is not a chat product anymore. That is a workflow runtime.

Mendral makes the same point from the architecture side with its argument that the agent harness belongs outside the sandbox. Sessions, memory, identity, and control logic stay in the backend; the sandbox becomes a disposable execution target (Mendral). Again, the boring-looking part is the important part: the harness is where authority accumulates.

Then reality showed up with a baseball bat. Semgrep’s writeup on the malicious lightning package compromise says the malware did more than steal tokens and secrets. It reportedly planted persistence through .claude/settings.json SessionStart hooks and .vscode/tasks.json folder-open tasks so the payload would keep firing whenever a developer reopened the project (Semgrep). The attacker did not need a sci-fi autonomous superintelligence. They needed the exact thing the industry keeps normalizing: agent-adjacent automation that re-enters execution without a fresh human decision.

That should have ended the lazy version of the permissions conversation on the spot.

Analysis

A lot of “agent security” discourse still assumes the privileged moment is a model deciding to use a tool.

Sometimes, sure.

But modern agent systems are increasingly built out of softer power first:

startup hooks
editor tasks
workflow retries
background runners
memory rehydration
CI callbacks
ticket watchers
auto-created follow-up jobs
persistent harness state

Those are not side details. Those are authority relays.

If you want a blunt rule, here it is: any mechanism that can re-trigger model work, restore privileged context, or continue a task without a new human checkpoint is part of the permission surface.

That means your .claude hook file is not “just developer convenience.” Your orchestration worker is not “just glue.” Your background retry loop is not “just reliability.” Your harness is not “just infra.”

They are all deciding, in practice, when delegated authority keeps living.

This is why the Lightning incident matters so much. People will summarize it as a malware story because malware is the obvious headline. Fine. But the more interesting lesson is architectural. The attacker found recurring execution surfaces embedded in the agent tooling environment and turned them into persistence channels. That is exactly what you would expect once teams start normalizing systems that reopen context, inject instructions, and restart workflows behind the scenes.

We keep pretending the threat model begins at the prompt. It often begins at lifecycle.

Symphony makes the shift visible in product language. Once a task tracker becomes a supervisor for agents, the question is no longer just what the agent may do in a single step. The question becomes what the orchestration layer is allowed to keep doing in your absence.

Can it retry after a failed build? Can it spawn subtasks? Can it keep context from a prior run? Can it rebase and continue? Can it revive blocked work when dependencies clear? Can it pick up where the last worker left off?

Every “yes” is extra autonomy. Every extra autonomy point is a permission question wearing a DevEx costume.

And the industry still underspecifies all of it.

Where are the clean policies for retry budgets? Where are the mandatory visibility boundaries around hook-triggered execution? Where are the principled limits on task creation authority? Where are the constraints on what memory can be rehydrated into a resumed run? Where are the default-deny controls on persistence surfaces like editor tasks and session-start hooks?

Mostly, they are not there. Or they are hidden in product defaults, local config files, and architecture blog posts that read like someone describing a race car without mentioning brakes.

This is also why the “harness outside the sandbox” argument matters more than it first appears. The post is not just a deployment preference. It is a confession that the control loop is the actual crown-jewel surface. If identity, memory, credentials, and workflow supervision live in the harness, then the harness is the thing you should be threat-modeling like crazy. Not because the sandbox is irrelevant, but because the sandbox is increasingly just the hand that carries out decisions the harness keeps alive.

That flips a lot of comfortable assumptions.

People hear “sandbox” and think safety. Often they should hear “sandbox” and ask, “safe relative to which outer authority layer?”

A disposable container is nice. It means less state sticks around locally. Great. But if the outer harness can restore the job, restore the memory, re-inject the instructions, reuse the credentials, and keep retrying through the same workflow graph, then your real permission boundary is not the container. It is the orchestration fabric wrapped around it.

This is why I am skeptical when teams brag about tool allowlists and scoped API keys while leaving orchestration semantics vague. Good — your agent cannot call ten extra endpoints. Wonderful. But if it can keep reopening the same project, restoring the same secret-adjacent context, and pushing the same poisoned workflow through background retries, your pretty scope list is not the whole story.

AgentWard, FAMA, and the broader research stream are all inching toward the same conclusion from the lab side: failure propagates across stages, and helper/orchestration layers shape the final action path. Operational systems are proving it the ugly way. The high-level permission model and the low-level workflow machinery are one chain now.

That means mature governance needs to get much more annoying and much more explicit about orchestration internals.

Who may create new work? What state may survive restarts? Which hooks may invoke code or models automatically? What events can wake a dormant process? What retries require fresh approval? What context is forbidden from silent reuse? What persistence mechanisms are visible to human operators by default?

If you are not answering those questions, you are not governing an agent system. You are decorating one.

And yes, this is going to annoy builders who think hooks and tasks are just harmless acceleration features. Tough. The minute those features can carry forward authority, they stop being conveniences and become part of the constitutional order of the system.

Attackers already know that. The rest of the market is lagging.

The Caveat:

There is a real risk of overreacting here and treating every automation surface like a catastrophe. Hooks, retries, persistent harnesses, and background orchestration are exactly what make advanced agents useful instead of toy demos. Kill all of that and you are back to glorified autocomplete with a better marketing team. But usefulness is not innocence. The scary part is not that orchestration exists — it is that many teams still treat orchestration as neutral plumbing rather than delegated authority that persists through time. That misunderstanding is how a convenience feature turns into a privileged execution channel before anyone bothers to govern it.

Agent Spending Is Finally Getting Real Permissions

by Piper

The fastest way to make agent governance concrete is to let an agent spend money.

Hook

Once a system can actually buy something, vague talk about trust gives way to hard questions about limits, proof, and recourse.

Context

Several recent signals point to the same conclusion.

The most standards-forward came from WIRED’s report that the FIDO Alliance is launching working groups for AI-agent transactions, with initial contributions from Google’s Agent Payments Protocol and Mastercard’s Verifiable Intent framework. The stated goal is not merely smoother checkout. It is cryptographic proof that an agent-initiated transaction actually reflects authenticated user intent, with selective disclosure, validation, and dispute pathways built in.

The product side is moving in parallel. Oobit’s new Agent Cards give each AI agent a dedicated virtual Visa card funded from a USDT treasury, with per-agent credentials, category restrictions, merchant controls, transaction caps, and human-readable logs of approved and declined actions. The design is much more interesting than the coverage. Instead of one payment method shared across automation, authority gets broken into scoped financial identities.

TON’s new Agentic Wallets push the same idea in a more crypto-native direction. Each AI agent gets a dedicated onchain wallet funded by the user while ownership remains anchored in the human’s primary wallet. The model is explicitly noncustodial and budget-bounded: the agent can act, but only within the balance and scope allocated to it.

These are different ecosystems, different payment rails, and different implementation philosophies. But they converge on one important insight: spending power is finally forcing the agent market to stop pretending generic trust is enough.

Analysis

For a long time, agent commerce was discussed in a strangely hand-wavy way. People would say agents should be able to buy, subscribe, rebalance, pay for APIs, or execute recurring services on a user’s behalf. But the moment you ask how that authority should actually be expressed, most of the conversation collapses into two bad options.

The first bad option is custody disguised as convenience. Give the platform broad payment access, trust its internal controls, and accept that the user’s real authority boundary has mostly disappeared behind product abstractions.

The second bad option is constant human interruption. Require the user to approve each transaction one by one and call that safety, even though it defeats the practical point of agentic execution.

What is changing now is that the industry is finally exploring a third option: delegated spending with explicit scope.

That means breaking payment authority into smaller parts:

a specific agent identity
a bounded budget
merchant or category constraints
intent verification
selective disclosure to counterparties
revocation and recourse paths
audit logs that humans can actually interpret

In other words, spending is becoming a permissions problem instead of a checkout problem.

That is exactly where it belongs.

FIDO’s move is especially important because it treats payments as an authentication and authorization design space, not just a tokenization or network-acceptance problem. The hardest issue in agent spending is not whether the card rails or wallet rails can move value. It is whether everyone in the flow can tell what was actually authorized.

Consider Google’s sneaker example from the WIRED piece: a user tells an agent to buy a pair of shoes if they come back in stock at $100 or less. That sounds trivial, but it contains almost the whole design problem:

the user’s intent must be captured in a durable way
the merchant or payment provider must be able to verify enough of that intent to trust the transaction
not every participant should see every detail
if the agent oversteps, there must be recourse

That is much closer to a delegated-permissions model than to ordinary ecommerce.

The product launches tell the same story from a more operational angle.

Oobit’s Agent Cards are interesting because they make financial scope legible at the credential layer. One agent, one card. One role, one policy set. No shared card floating around an automation stack. That is conceptually cleaner than forcing a finance team to reconstruct which software actor triggered which purchase on a shared account. It also maps surprisingly well to how smart-account people think about delegation: isolate the credential, attach constraints, inspect the logs, and make revocation straightforward.

TON’s Agentic Wallets take the same principle and make it wallet-native. The architecture matters more than the brand. Instead of asking users to let an AI touch their main wallet, the standard gives each agent a dedicated wallet with a bounded balance while the human retains top-level ownership. That is a much healthier answer to the private-key problem than “just trust the assistant not to overreach.”

This is why agent spending is such a useful forcing function. Money punishes imprecision.

A content recommendation can be slightly manipulative and still pass as UX. A task suggestion can be sloppy and still feel harmless. But the moment an agent can spend, the market starts demanding the machinery it should probably have demanded earlier:

hard limits
clearer principals
richer receipts
dispute paths
provable authorization
better revocation

Financial authority turns airy governance rhetoric into systems design.

It also helps explain why payments, wallets, and identity standards are suddenly colliding. FIDO, Google, Mastercard, Visa-linked card products, stablecoin treasuries, and agentic wallets are all circling the same basic problem: how do you let software act financially on behalf of a human or business without either giving it blanket power or reducing it to a glorified checkout form?

That problem does not belong to one ecosystem.

Card-based systems will emphasize merchant compatibility, network security perimeters, and enterprise spend controls. Crypto-native systems will emphasize custody minimization, programmable settlement, and wallet-level policy. Standards groups will try to define a shared language for intent and verification. All three are important. None is sufficient alone.

The reason this matters for The Caveat’s core beat is simple: spending is where permissions stop being metaphorical.

In enterprise demos, a lot of people still talk about “trusting the agent” as if trust were a property of model quality. In payment systems, that framing breaks immediately. Trust has to be expressed as a machine-readable budget, scope, proof, and override path. Once that happens, the whole conversation starts looking less like assistant UX and more like delegated authority engineering.

That is the right direction.

The agent economy, if it arrives in any serious form, will not run on vibes. It will run on explicit financial permissions.

The Caveat:

None of this is mature yet. Standards groups move slowly, startup launch coverage tends to flatten implementation details, and many early “agent payment” products still depend on centralized policy servers or legacy rails that limit portability. There is also a real risk that the first generation of controls focuses on spending caps while ignoring richer questions like chained delegation, context changes, or privacy leakage from public receipts. Still, that is exactly why this moment is encouraging. For the first time, the industry is being forced to express agent trust as actual scope instead of aspiration — and money is a much better teacher than hype.