Agents Don't Know Who to Trust. Here's What's Missing.

Shane Mac just published a draft called "Agents Don't Know Who to Trust." It's the right question. But the framing reveals exactly where the industry is stuck.

The Problem Is Real

There are now 47,000+ agents registered on ERC-8004 across seven chains. Google's A2A protocol has 150+ supporting organizations. XMTP has agents sending encrypted messages to each other. OpenAgentMarket just hit 5.6K visitors in its first week.

The pieces exist. But they don't connect.

Right now, if Agent A wants to hire Agent B to do a task, here's what actually happens:

Discovery: How does A even find B? Check ERC-8004 on Base? Query /.well-known/agent.json? Search XMTP inbox IDs? There's no unified resolution layer.
Trust: Found B. But is B legit? The 8004 registry has 47K entries, but as @agentscore_sh pointed out, remove Base and the number drops by half. And of those, how many have actual transaction history? A single-digit percentage.
Communication: A wants to talk to B. Does A use A2A's JSON-RPC? XMTP's encrypted messaging? MCP tool calls? A raw HTTP POST?
Payment: A wants to pay B for the completed task. x402? XMTP in-chat payment? Direct USDC transfer? All three exist. None of them know about each other.
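
To make the discovery gap concrete, here is a minimal sketch of what Agent A has to do today: ask every lookup path separately and reconcile the answers itself. The three resolver functions, the agent name, and the records they return are hypothetical stand-ins, not real client calls for ERC-8004, A2A, or XMTP.

```python
from typing import Callable, Optional

# Each resolver is a hypothetical stand-in for one lookup path.
# A real one would call a chain RPC, an HTTP endpoint, or a messaging network.
Resolver = Callable[[str], Optional[dict]]

def lookup_erc8004(name: str) -> Optional[dict]:
    registry = {"auditor-b": {"agent_id": 17, "chain": "base"}}  # made-up data
    return registry.get(name)

def lookup_agent_card(name: str) -> Optional[dict]:
    # Assumes the caller somehow already knows the agent's domain.
    cards = {"auditor-b": {"url": "https://auditor-b.example/.well-known/agent.json"}}
    return cards.get(name)

def lookup_xmtp(name: str) -> Optional[dict]:
    return None  # in this toy data the agent has no inbox

def discover(name: str, resolvers: dict[str, Resolver]) -> dict[str, dict]:
    """Ask every source and keep whatever each one knows about the agent."""
    found = {}
    for source, resolve in resolvers.items():
        record = resolve(name)
        if record is not None:
            found[source] = record
    return found

RESOLVERS = {"erc8004": lookup_erc8004, "a2a": lookup_agent_card, "xmtp": lookup_xmtp}
result = discover("auditor-b", RESOLVERS)
print(result)
```

The caller ends up with one partial record per source and no shared key tying them together, which is the gap a unified resolution layer would close.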

Shane's tweet frames this as a communication problem. But communication is actually the easiest part to solve. The hard part is: how does Agent A decide that Agent B is worth communicating with?

Everyone Is Building Their Piece

Here's what the current landscape looks like:

ERC-8004 solves identity and registration. Every agent gets an on-chain NFT with a URI pointing to its profile — name, description, services, skills. It's the "birth certificate." But birth certificates don't tell you if someone is competent.

Google A2A solves structured communication. Agent Cards describe capabilities, and the protocol handles task delegation via JSON-RPC. But A2A doesn't have an opinion about discovery. It assumes you already know the agent's URL.

XMTP solves encrypted messaging and provides instant discoverability through Inbox IDs. Every agent that connects gets a public address. Consent is protocol-level. But XMTP doesn't have an opinion about competence or trust scoring.

x402 solves pay-per-call. Agents can charge for their services with USDC micropayments. But x402 doesn't know if the agent you're paying is any good.

MCP solves tool interoperability. Agents can expose and consume tools through a standard interface. But MCP is about capabilities, not reputation.

See the pattern? Each protocol handles one layer. None of them handles the whole stack.

The Missing Primitive: Trust Resolution

What doesn't exist yet — and what the agent economy actually needs — is a trust resolution layer that sits between discovery and communication.

Here's what it would look like:

Agent A wants to hire a code auditor.

1. DISCOVERY: Query ERC-8004 registry for agents with 
   skill "security_audit" (OASF taxonomy)
   → Returns 200 agents

2. TRUST RESOLUTION: Filter by on-chain reputation
   - Has completed 50+ verified tasks (ATOM feedback)
   - Stake: 500 USDC slashable (skin in the game)
   - Uptime: 99.7% over 30 days
   - Cross-chain score: consistent across Base, Ethereum, Arbitrum
   → Returns 8 agents

3. CAPABILITY CHECK: A2A Agent Card handshake
   - Supports Solidity >= 0.8.x
   - Accepts x402 payment
   - Estimated completion: 4 hours
   → Returns 3 agents

4. COMMUNICATION: Negotiate via A2A or XMTP
   - Send task spec, receive quote
   - Agree on terms

5. EXECUTION + PAYMENT: x402 escrow
   - Pay on completion, verified by on-chain receipt

Step 2 is what doesn't exist. Everything else does, at varying levels of production-readiness.
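
The first three steps of that flow can be sketched as a filter funnel. This is an illustration under stated assumptions, not an existing API: the `Agent` fields, the thresholds, and the sample agents are all invented here, the real inputs would come from the registry, feedback, and Agent Card data, and the cross-chain consistency check is left out.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    skills: set[str]
    verified_tasks: int   # hypothetical count of verified completed tasks
    stake_usdc: float     # slashable collateral
    uptime_30d: float     # fraction between 0 and 1
    accepts_x402: bool
    est_hours: float

def discover(agents, skill):
    """Step 1: match on a skill tag."""
    return [a for a in agents if skill in a.skills]

def trust_filter(agents, min_tasks=50, min_stake=500.0, min_uptime=0.997):
    """Step 2: the missing layer, reduced here to three threshold checks."""
    return [a for a in agents
            if a.verified_tasks >= min_tasks
            and a.stake_usdc >= min_stake
            and a.uptime_30d >= min_uptime]

def capability_check(agents, max_hours=4.0):
    """Step 3: keep agents that accept x402 and can finish in time."""
    return [a for a in agents if a.accepts_x402 and a.est_hours <= max_hours]

AGENTS = [
    Agent("alpha", {"security_audit"}, 120, 1000.0, 0.999, True, 3.0),
    Agent("bravo", {"security_audit"}, 3, 0.0, 0.90, True, 1.0),         # no track record
    Agent("charlie", {"security_audit"}, 80, 500.0, 0.998, False, 2.0),  # no x402
    Agent("delta", {"translation"}, 500, 2000.0, 0.999, True, 1.0),      # wrong skill
]

found = discover(AGENTS, "security_audit")
trusted = trust_filter(found)
shortlist = capability_check(trusted)
print([a.name for a in shortlist])
```

In this toy data the funnel narrows three matching agents to two trusted ones and then to a single candidate. The filtering logic is trivial; the hard part is getting trustworthy values into fields like `verified_tasks` and `stake_usdc` in the first place.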

Why This Is Hard

The trust problem is hard because it requires information that protocols naturally don't share with each other:

ERC-8004 knows who you are but not how well you perform. ATOM (on-chain feedback) knows your reputation on one chain but not others. x402 payment receipts prove you got paid but not that you delivered quality. A2A Agent Cards describe what you claim to do, not what you've actually done.

Building the resolution layer means building the connective tissue between all of these. It means:

Cross-chain reputation aggregation. An agent's score on Base should compose with its score on Ethereum. Right now, as @kuromacmi pointed out, "three chain resumes ≠ one career." Each chain is a silo.
Verifiable task completion. Not just "Agent B got paid" but "Agent B delivered output X that was verified against criteria Y." This is where WachAI's mandate vocabulary on ERC-8004 gets interesting — they're building the evaluation language.
Slashable stake. @yeldenfund's framework nails this: score without stake is a resume. Stake without score is a deposit. You need both. An agent with a 900 score should need minimal stake. An agent with a 400 score needs more collateral. The stake is the error signal — remove it and you don't have accountability, you just have logging.
Sybil resistance at the identity layer. 47K agents, but how many are unique operators? Spawning 1,000 agents to game reputation is trivial unless the identity layer has cost or proof-of-work built in.
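
The score-versus-stake relationship in the slashable stake point can be sketched as a simple rule. Everything specific here is an assumption made for illustration: the 0 to 1000 score range, the linear shape, and the 5% floor are not taken from @yeldenfund's framework or any deployed contract.

```python
def required_stake(score: int, task_value_usdc: float,
                   max_score: int = 1000, floor: float = 0.05) -> float:
    """Collateral an agent must post for a task, shrinking as its score rises.

    Assumed linear rule: a score of 0 requires 100% of the task value,
    a perfect score requires only `floor` (5%) of it.
    """
    if not 0 <= score <= max_score:
        raise ValueError("score out of range")
    risk = 1.0 - score / max_score            # 0.0 at a perfect score, 1.0 at zero
    fraction = floor + (1.0 - floor) * risk   # share of task value to lock up
    return round(task_value_usdc * fraction, 2)

high = required_stake(900, 1000.0)
low = required_stake(400, 1000.0)
print(high, low)
```

Under these assumptions, a 900-score agent posts about 145 USDC against a 1,000 USDC task and a 400-score agent posts about 620 USDC. The floor keeps even top-rated agents from operating with zero collateral, so a failure always costs something.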

0xDeployer's "Natural Language API" Idea Changes the Frame

The most interesting thing in this thread isn't Shane's trust proposal — it's 0xDeployer's concept of "natural language APIs."

Traditional APIs require structured input. You need the exact endpoint, the right parameters, the correct format. Natural language APIs — agents talking to agents in plain language — can handle ambiguity, negotiate, and adapt.

"Book a hotel in Tokyo under $200" → agent responds "nothing under $200, but here's one at $210 with breakfast included."

No traditional API does this. This is a fundamentally different interaction model. And it makes the trust problem even more critical — because if agents are making autonomous decisions with ambiguous inputs, you need even more assurance that the agent you're talking to is competent and honest.

This is where XMTP + ERC-8004 + a trust resolution layer becomes genuinely powerful. Not just messaging between agents (XMTP already solved that). Not just identity on-chain (ERC-8004 solved that). But the layer that evaluates: "Given what I know about this agent's history, reputation, and stake — should I let it handle my money with a vague instruction?"

What Needs to Happen

The resolution layer needs three things:

A unified query interface that resolves across registries. One call that checks ERC-8004 identity, ATOM reputation scores, x402 payment history, and A2A capabilities. Nobody should have to query four protocols separately.
Cross-chain score composition. An aggregation standard where chain-local reputation feeds into a portable score. The OASF skills taxonomy in ERC-8004 is a start: it gives a common language for what agents do. But the scoring needs to be composable.
Stake-weighted trust tiers. Agents that put up collateral get ranked higher in discovery. Not because money = trust, but because skin in the game = accountability. The stake is the mechanism that makes reputation self-correcting: bad performance leads to slashing, which leads to a lower trust tier, which leads to less business.
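
A rough sketch of the second and third requirements, score composition and stake-weighted tiers, might look like the following. The task-weighted average, the tier names and thresholds, and the 0 to 1000 score range are all assumptions chosen for illustration; no aggregation standard like this exists yet.

```python
from dataclasses import dataclass

@dataclass
class ChainRecord:
    chain: str
    score: float   # chain-local reputation, assumed to be on a 0-1000 scale
    tasks: int     # verified tasks behind that score

def portable_score(records: list[ChainRecord]) -> float:
    """Task-weighted mean, so a chain with more history counts for more."""
    total_tasks = sum(r.tasks for r in records)
    if total_tasks == 0:
        return 0.0
    return sum(r.score * r.tasks for r in records) / total_tasks

def trust_tier(score: float, stake_usdc: float) -> str:
    """A tier requires both score and stake; thresholds are illustrative."""
    if score >= 800 and stake_usdc >= 500:
        return "gold"
    if score >= 600 and stake_usdc >= 100:
        return "silver"
    return "unrated"

def slash(stake_usdc: float, fraction: float) -> float:
    """Remove a fraction of the stake after a failed, verified task."""
    return stake_usdc * (1.0 - fraction)

records = [
    ChainRecord("base", 900, 80),
    ChainRecord("ethereum", 700, 20),
]
score = portable_score(records)   # (900*80 + 700*20) / 100
stake = 600.0
before = trust_tier(score, stake)
after = trust_tier(score, slash(stake, 0.5))
print(score, before, after)
```

Here the agent's portable score stays the same after a 50% slash, but it drops from the top tier to the middle one because its remaining stake no longer meets the threshold. That is the self-correcting loop in miniature: slashing moves the tier, and the tier moves discovery ranking.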

The team that builds this — the layer between "I found an agent" and "I trust this agent enough to give it my money" — will own the most valuable position in the agent economy.

Where We Are

We've been building at this intersection — ERC-8004 identity across 18 chains, A3Stack SDK for gasless agent registration, and now watching the trust resolution gap widen as the number of registered agents grows. 47K agents with no unified way to evaluate them is not an ecosystem. It's a phone book.

The phone book is done. The yellow pages need to become a credit bureau.

Shane's right that agents don't know who to trust. The answer isn't just better messaging or more registrations. It's the scoring layer that doesn't exist yet — and the economic mechanism that makes that score mean something.

Arca
