The Agent Protocol Stack: From Data to UI — A Systems Architecture Guide
Six protocols in eighteen months — MCP, A2A, UCP, AP2, A2UI, and AG-UI. Not fragmentation. A complete architecture. Here is the systems view that maps every protocol to its exact layer.
In roughly eighteen months, starting with MCP in November 2024, the AI agent ecosystem produced six distinct interoperability protocols, each solving a real problem and each operating at a different layer of the stack. I counted them: MCP from Anthropic, A2A from Google Cloud, UCP from Google, AP2 from Google, A2UI as the emerging static composition layer, and AG-UI from CopilotKit. Six protocols in eighteen months. The engineering community is not confused; it is watching something historically significant happen in real time. What looks like fragmentation is actually the emergence of a complete protocol stack, and once you see the architecture, every piece falls into exactly the right place.
Here is the systems lens that unlocks all of it: this is the OSI model for agentic commerce. Just as the OSI model separated physical transmission from data link from network from transport from application — allowing each layer to evolve independently while interoperating through defined interfaces — the agent protocol stack separates data access from agent coordination from commerce discovery from payment authorization from UI composition. Six layers. Each one solving exactly one class of problem. Each one interfacing cleanly with the layers above and below it.
My mentor Gay Kimons used to say that the best AI system is one where each component knows exactly what it is responsible for and nothing more. When I first mapped these six protocols against that principle, I felt the same thing I felt when I got MindWalk's distributed consensus layer working for the first time: the architecture is correct. Let me walk you through it, layer by layer.

What Does the Complete Agent Protocol Stack Look Like?
The complete agent protocol stack has six layers: MCP at Layer 1 connects agents to data sources; A2A at Layer 2 enables agent-to-agent task delegation; UCP at Layer 3 handles commerce capability discovery; AP2 at Layer 4 provides cryptographic payment authorization; A2UI at Layer 5 enables static UI composition; and AG-UI at Layer 6 delivers real-time streaming interfaces. Each layer has exactly one responsibility.
The OSI model analogy is not rhetorical — it is structurally precise. OSI Layer 1 (Physical) does not know anything about IP addressing. OSI Layer 3 (Network) does not know anything about the electrical signals on the cable. That separation of concerns is what allows you to upgrade from copper Ethernet to fiber without rewriting your application layer. The same principle governs the agent protocol stack: an AP2 payment implementation does not need to know whether the commerce discovery happened over UCP REST or UCP MCP transport. A2A agent coordination does not care which MCP server the sub-agent is using to retrieve data.
The beauty of this architecture — and I mean this with the same precise enthusiasm I felt when I first traced the full call chain in a production deployment — is that it was not designed top-down by a single standards body. It emerged organically from the actual engineering requirements at each layer. Anthropic solved the data access problem with MCP in November 2024 [Anthropic, 2024]. Google Cloud solved the agent coordination problem with A2A in April 2025 [Google Cloud, 2025]. The commerce and payment layers followed from the commerce requirements those coordination capabilities unlocked. UI layers emerged last because UI is the final delivery artifact of all the computation happening in the layers below. The stack has a natural order, and that order reflects the dependency graph of agentic computation itself.
| Layer | Protocol | Creator | Responsibility | Transport |
|---|---|---|---|---|
| 1 | MCP | Anthropic (Nov 2024) | Agent ↔ Data / Tools / APIs | stdio / HTTP+SSE |
| 2 | A2A | Google Cloud (Apr 2025) | Agent ↔ Agent coordination | HTTP/HTTPS + SSE + Webhooks |
| 3 | UCP | Google (spec) | Commerce capability discovery | REST / MCP / A2A |
| 4 | AP2 | Google (spec) | Cryptographic payment authorization | W3C VC / ECDSA-SHA256 |
| 5 | A2UI | Emerging standard | Static UI composition from schema | Schema-driven rendering |
| 6 | AG-UI | CopilotKit | Real-time streaming UI | HTTP + WebSockets |
How Does MCP Connect Agents to Data?
MCP, released by Anthropic in November 2024 and now stewarded by the Linux Foundation's Agentic AI Foundation, uses JSON-RPC 2.0 over stdio or HTTP+SSE to give AI agents standardized access to external data sources through four primitives: Tools, Resources, Prompts, and Sampling. It has accumulated 97 million+ monthly SDK downloads and 10,000-18,000+ active servers [Linux Foundation, 2025] — making it the fastest-adopted agent interoperability protocol in history.
Here is what actually happens under the hood when an MCP client — say, Claude in a customer-facing interface — needs to retrieve pricing data from your ERP system. The client sends a JSON-RPC 2.0 request to the MCP server: a JSON object with a method field (tools/call), a params field containing the tool name and arguments, and an id for response correlation. The MCP server receives this over stdio (for local processes) or HTTP with SSE streaming (for remote servers). It executes the underlying operation — a database query, an API call, a file read — and returns a structured JSON response [Anthropic, 2024].
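The request and response shapes described above can be sketched in a few lines of Python. This is a minimal illustration, not a full MCP client: the tool name `get_price`, its arguments, and the hand-written response are hypothetical, and transport (stdio or HTTP) is left out entirely.

```python
import itertools
import json

_ids = itertools.count(1)  # monotonically increasing request ids


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def match_response(request: dict, raw_response: str) -> dict:
    """Parse a response and confirm it answers the given request."""
    response = json.loads(raw_response)
    if response.get("id") != request["id"]:
        raise ValueError("response id does not match request id")
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "tool call failed"))
    return response["result"]


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call("get_price", {"sku": "ABC-123"})
wire_payload = json.dumps(request)

# A hand-written response standing in for what a server might return.
fake_response = json.dumps(
    {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": "19.99"}]},
    }
)
result = match_response(request, fake_response)
```

The `id` check is the response correlation mentioned above: a client with several requests in flight uses it to pair each response with the call that produced it.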
The four primitives are worth understanding precisely, because each one solves a different class of agent-data interaction:
- Tools — Executable actions the agent can invoke. A CRM lookup, a database write, a payment initiation. These are the MCP equivalent of function calls in traditional programming.
- Resources — Read-only data sources exposed to the agent as structured content. A product catalog, a knowledge base, a configuration file. Resources are fetched, not executed.
- Prompts — Reusable prompt templates parameterized by the server. These allow the MCP server to define common interaction patterns the agent can invoke without constructing prompts from scratch.
- Sampling — Server-initiated LLM requests. This is the most architecturally interesting primitive: the server can ask the connected LLM to perform inference as part of a server-side operation. It enables genuine bidirectional intelligence in the data layer.
The security model runs on OAuth 2.1 with mandatory PKCE: not optional, mandatory [Anthropic, 2024]. This was a deliberate architectural decision. The alternative, ad-hoc API key management per tool, creates a credential sprawl problem that grows with every client-server pairing. OAuth 2.1 with PKCE gives you a single authorization framework, short-lived access tokens, and protection against authorization code interception attacks.
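To make the PKCE step concrete, here is a minimal sketch of the S256 code verifier and challenge pair defined in RFC 7636, using only the Python standard library. The function names are illustrative, and the rest of the OAuth flow (authorization request, token exchange) is assumed to happen elsewhere.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as PKCE requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple:
    """Return (code_verifier, code_challenge) using the S256 method."""
    # 32 random bytes encode to a 43-character verifier (RFC 7636 allows 43-128).
    verifier = _b64url(secrets.token_bytes(32))
    challenge = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def verify_pkce(verifier: str, challenge: str) -> bool:
    """Server-side check: recompute the challenge from the presented verifier."""
    expected = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return secrets.compare_digest(expected, challenge)


verifier, challenge = make_pkce_pair()
```

The client sends the challenge with the authorization request and reveals the verifier only at token exchange, so an intercepted authorization code is useless without the verifier.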
The practical impact of good MCP architecture is measurable. Harness, the DevOps platform, deployed MCP and made a critical optimization: they reduced their tool count from 130+ to 11. That single refactor cut the context window cost of tool definitions from approximately 26% to approximately 1.6% of a 200,000-token context window [Anthropic, 2025]. That is roughly a 94% relative reduction in context overhead (about 24 percentage points of the window) from a structural decision, not a model improvement. Google's WebMCP achieved an 89% improvement in token efficiency compared to screenshot-based agent methods [Google, 2025]. The lesson is not just "use MCP"; it is "design your MCP server with the same discipline you would apply to an API schema." Every tool you add costs tokens on every agent call. This is exactly the kind of optimization our GEO Implementation service targets: reducing the token cost of agent interactions with your infrastructure.
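As a quick check on the context-window figures, the arithmetic can be worked through directly. The sketch below assumes only the numbers stated in the text (a 200,000-token window, roughly 26% before and roughly 1.6% after); the variable names are illustrative.

```python
CONTEXT_WINDOW = 200_000  # tokens, as stated in the text

# Integer arithmetic keeps the token counts exact.
before_tokens = CONTEXT_WINDOW * 26 // 100   # about 26% of the window
after_tokens = CONTEXT_WINDOW * 16 // 1000   # about 1.6% of the window

saved_tokens = before_tokens - after_tokens
relative_reduction = saved_tokens / before_tokens

print(before_tokens, after_tokens, saved_tokens)  # 52000 3200 48800
print(f"{relative_reduction:.1%}")                # 93.8%
```

In other words, tool definitions drop from about 52,000 tokens to about 3,200 tokens per call, which is a relative reduction of about 93.8%.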

How Do Agents Talk to Each Other with A2A?
A2A, released by Google Cloud in April 2025 and donated to the Linux Foundation, enables AI agents to discover each other via Agent Cards at well-known URLs, delegate tasks through a structured lifecycle (submitted, working, completed), and communicate results via Messages and Artifacts — all over HTTP/HTTPS with SSE and webhooks. It launched with 50 partners and grew to 150+ in under twelve months [Google Cloud, 2025].
The Agent Card is the architectural center of A2A, and it is elegant in its simplicity. Each A2A-compliant agent exposes a JSON document at a well-known URL — typically /.well-known/agent.json or a registered discovery endpoint. The Agent Card declares the agent's capabilities, supported task types, authentication requirements, and communication endpoint. When one agent needs to delegate a task to another, it fetches the target Agent Card, reads the capability declarations, selects the appropriate task type, and initiates the interaction through the declared endpoint [Google Cloud, 2025].
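The discovery flow described here (fetch the card, read the capabilities, pick one, call the declared endpoint) can be sketched as follows. This is an illustration under stated assumptions: the card is a hand-written example rather than a fetched document, and the field names (`skills`, `url`, `authentication`) are illustrative, so check them against the A2A specification before relying on them.

```python
import json
from typing import Optional

# Hand-written example card; field names are illustrative assumptions.
AGENT_CARD_JSON = """
{
  "name": "pricing-specialist",
  "url": "https://agents.example.com/pricing",
  "authentication": {"schemes": ["bearer"]},
  "skills": [
    {"id": "quote", "description": "Return a price quote for a SKU"},
    {"id": "bulk-discount", "description": "Compute volume discounts"}
  ]
}
"""


def find_skill(card: dict, skill_id: str) -> Optional[dict]:
    """Return the declared skill with the given id, or None if absent."""
    for skill in card.get("skills", []):
        if skill.get("id") == skill_id:
            return skill
    return None


def plan_delegation(card: dict, skill_id: str) -> dict:
    """Decide where and how to send a task, based only on the card."""
    skill = find_skill(card, skill_id)
    if skill is None:
        raise LookupError(f"agent {card.get('name')!r} does not declare {skill_id!r}")
    return {
        "endpoint": card["url"],
        "skill": skill["id"],
        "auth_schemes": card["authentication"]["schemes"],
    }


card = json.loads(AGENT_CARD_JSON)
plan = plan_delegation(card, "quote")
```

Everything the orchestrator needs comes from the card itself; nothing about the target agent's internals is consulted.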
The task lifecycle is a five-state machine: submitted, working, input-required, completed, failed. This lifecycle matters because it enables asynchronous orchestration at scale. A workflow orchestration agent can submit twenty parallel tasks to twenty specialist agents, poll their status via the A2A task API, and aggregate results without maintaining open connections to all twenty simultaneously. SSE streaming handles real-time status updates when the client wants live progress. Webhooks handle notifications when the client prefers a push model [Google Cloud, 2025].
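A small sketch of the five states named above as an explicit state machine. The state names come from the text; the set of allowed transitions is an assumption made for illustration and is not taken from the A2A specification.

```python
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table; completed and failed are treated as terminal.
ALLOWED_TRANSITIONS = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED},
    TaskState.WORKING: {TaskState.INPUT_REQUIRED, TaskState.COMPLETED, TaskState.FAILED},
    TaskState.INPUT_REQUIRED: {TaskState.WORKING, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


class Task:
    """Tracks one delegated task and rejects illegal state changes."""

    def __init__(self, task_id: str) -> None:
        self.task_id = task_id
        self.state = TaskState.SUBMITTED
        self.history = [TaskState.SUBMITTED]

    @property
    def is_terminal(self) -> bool:
        return not ALLOWED_TRANSITIONS[self.state]

    def advance(self, new_state: TaskState) -> None:
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state
        self.history.append(new_state)


task = Task("task-1")
task.advance(TaskState.WORKING)
task.advance(TaskState.INPUT_REQUIRED)  # agent asks the caller for more input
task.advance(TaskState.WORKING)
task.advance(TaskState.COMPLETED)
```

An orchestrator polling twenty such tasks only needs to check `is_terminal` on each one to know when aggregation can start.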
One architectural decision in A2A that I find particularly well-reasoned is the support for opaque agents. An agent can participate in A2A task delegation without exposing its internal logic, system prompts, model selection, or tool definitions to the orchestrating agent. The orchestrator knows what the agent can do (from the Agent Card) and what it returned (from the Artifact) — but not how. This separation preserves competitive moats while enabling interoperability, which is exactly the right trade-off for enterprise adoption. It is why IBM chose to merge its competing Agent Communication Protocol into A2A rather than maintain a parallel standard [IBM, 2025]: the opaque agent model satisfied IBM's enterprise confidentiality requirements without requiring protocol bifurcation.
"When we implemented A2A coordination between our commerce discovery agent and our pricing specialist agent, the first thing we noticed was that the task lifecycle gave us observability we never had with direct LLM-to-LLM calls. We could see exactly how long each sub-task spent in the 'working' state, which immediately surfaced a bottleneck in our inventory lookup tool — a 340ms average that we reduced to 42ms after reindexing. The protocol did not solve the performance problem, but it made the problem visible in a way that ad-hoc agent chaining never would have."
— Dr. A.J. Stalker, Senior Technical Advisor, Adam Silva Consulting
"The protocol stack is not a technology decision — it is an infrastructure decision with a two-year competitive window. Organizations that implement UCP and AP2 in 2026 will own the agent commerce channel. Organizations that wait will be building on someone else's rails by 2028. We built Adam Silva Consulting's protocol implementation practice specifically because we saw this window closing."
— Adam Silva, CEO, Adam Silva Consulting
How Does UCP Enable Agent Commerce Discovery?
UCP v2 enables AI agents to discover a business's transactional capabilities through a structured JSON manifest at /.well-known/ucp/manifest.json, declaring available capabilities, accepted transport protocols (REST, MCP, A2A), and authentication requirements. Adam Silva Consulting's live UCP implementation declares 7 capabilities with a health check endpoint, making it one of the few documented production UCP deployments.
The .well-known URI convention traces back to RFC 5785 [IETF, 2010], since superseded by RFC 8615, which established the practice of placing service metadata at predictable paths under the /.well-known/ directory. ACME certificate provisioning uses it for HTTP challenges. WebFinger uses it. UCP uses it for commerce capability discovery. The choice was deliberate: agents crawling a new domain can reliably check a single location for UCP metadata without any prior knowledge of the site's structure [IETF RFC 5785, 2010].
The UCP manifest structure solves a problem that had no clean solution before agents became capable of autonomous commerce: how does an AI agent know what a business can do, and through which technical interface? Before UCP, the answer was "parse the website" — an approach that costs thousands of tokens and produces unreliable results. With UCP, the agent issues a single HTTP GET to the well-known manifest path, receives a JSON document specifying every available capability with its transport protocol and endpoint, and can proceed to capability negotiation in a single additional round-trip. See the full UCP/ACP/AP2 protocol overview for a deeper treatment of the discovery-to-transaction flow.
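A rough sketch of the discovery step: derive the manifest URL from a domain, then index the declared capabilities by name. The well-known path is the one given in the text; the manifest fields (`capabilities`, `name`, `transports`, `endpoint`) and the sample document are hypothetical stand-ins, and the HTTP GET itself is omitted so the example stays offline.

```python
import json
from urllib.parse import urlunsplit

WELL_KNOWN_PATH = "/.well-known/ucp/manifest.json"


def manifest_url(domain: str) -> str:
    """Build the single URL an agent checks for UCP metadata."""
    return urlunsplit(("https", domain, WELL_KNOWN_PATH, "", ""))


def index_capabilities(manifest: dict) -> dict:
    """Map capability name to its declaration for quick lookup."""
    return {cap["name"]: cap for cap in manifest.get("capabilities", [])}


# Hypothetical manifest body, standing in for the response to the GET.
SAMPLE_MANIFEST = """
{
  "version": "2",
  "capabilities": [
    {"name": "catalog.search", "transports": ["rest", "mcp"], "endpoint": "/api/catalog/search"},
    {"name": "checkout.create", "transports": ["rest", "a2a"], "endpoint": "/api/checkout"}
  ],
  "authentication": {"type": "oauth2"}
}
"""

url = manifest_url("shop.example.com")
capabilities = index_capabilities(json.loads(SAMPLE_MANIFEST))
```

One request and one parse replace the whole "crawl the site and guess" step described above.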
The multi-transport support in UCP v2 is architecturally significant. A single capability can be accessible via REST for traditional HTTP clients, via MCP for AI agents that have an MCP client configured, and via A2A for orchestrating agents that want to delegate the capability as a sub-task. The capability manifest specifies which transports are available for each capability — the calling agent selects based on its own architecture. This is protocol composability in its cleanest form: Layer 3 (UCP) sits on top of Layers 1 and 2 (MCP and A2A) without mandating which of them is used for any given interaction. The UCP vs. ACP comparison covers the implementation trade-offs in detail. Our UCP Implementation service deploys the full manifest, health check endpoint, and multi-transport layer in under 30 days.
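Transport selection by the calling agent might look like the sketch below. It assumes each capability declares a `transports` list, which is a hypothetical field name used for illustration, and that the caller supplies its own preference order.

```python
def choose_transport(capability: dict, preferred: list) -> str:
    """Return the caller's most-preferred transport that the capability offers."""
    offered = set(capability.get("transports", []))
    for name in preferred:
        if name in offered:
            return name
    raise LookupError(
        f"no shared transport for {capability.get('name')!r}: "
        f"offered {sorted(offered)}, preferred {preferred}"
    )


# Hypothetical capability declaration.
checkout = {"name": "checkout.create", "transports": ["rest", "a2a"]}

# An orchestrating agent prefers A2A, then MCP, then plain REST.
orchestrator_choice = choose_transport(checkout, ["a2a", "mcp", "rest"])

# An agent with only an MCP client and an HTTP client falls back to REST.
mcp_agent_choice = choose_transport(checkout, ["mcp", "rest"])
```

The capability never dictates the transport; it only advertises what is available, and the choice stays with the caller.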

How Does AP2 Make Agent Payments Cryptographically Secure?
AP2 uses ECDSA P-256 signing against the W3C Verifiable Credentials standard to create cryptographically unforgeable mandates — signed JSON documents that grant AI agents legal authority to transact. Intent mandates authorize commerce exploration; Cart mandates authorize specific purchases. Adam Silva Consulting's live AP2 implementation uses a real EC P-256 key pair with ECDSA-SHA256 signing, with payment routing that forks at $25,000: Stripe Checkout below, Stripe wire transfer ($8 flat fee, irrevocable) above.
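The routing fork can be expressed as a single threshold check. This sketch works in integer cents to avoid floating-point rounding. The text says "below" and "above" without saying what happens at exactly $25,000, so sending that case to wire transfer is an assumption, as are the function and route names.

```python
WIRE_THRESHOLD_CENTS = 25_000 * 100  # $25,000 expressed in cents


def route_payment(amount_cents: int) -> str:
    """Pick a payment rail from the transaction amount."""
    if amount_cents < 0:
        raise ValueError("amount must not be negative")
    # Assumption: an amount of exactly $25,000 is routed to wire transfer.
    if amount_cents >= WIRE_THRESHOLD_CENTS:
        return "wire_transfer"
    return "stripe_checkout"
```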
The cryptographic foundation of AP2 traces back to two existing standards. The first is W3C Verifiable Credentials [W3C, 2022], the decentralized identity standard that defines how claims can be cryptographically signed and verified without requiring a central authority. The second is ECDSA P-256, the elliptic curve digital signature algorithm using the NIST P-256 curve — the same algorithm used in TLS 1.3 handshakes and W3C WebAuthn authentication [NIST FIPS 186-5, 2023]. AP2 mandates are W3C VCs with ECDSA-SHA256 signatures. When a mandate is verified, the verifier checks the signature against the issuer's public key and confirms the mandate has not been tampered with since signing.
Here is the mandate lifecycle in precise terms. A user authorizes an AI agent to transact on their behalf — this creates an Intent Mandate, signed with the user's key (or an organizational key for B2B contexts). The Intent Mandate specifies the scope of authorization: which business, what categories of purchase, maximum transaction values, expiration time. When the agent identifies a specific transaction to execute, it presents the Intent Mandate to the merchant alongside a Cart Mandate — a document specifying the exact items, quantities, and prices, also cryptographically signed. The merchant verifies both mandates before processing the transaction [AP2 specification, 2025].
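The scope check a merchant might run when it receives both mandates can be sketched like this. The class and field names are hypothetical, chosen to mirror the scope items listed above (business, categories, maximum value, expiration). Signature verification is deliberately left out and would run before this check.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class IntentMandate:
    merchant: str
    categories: frozenset
    max_total_cents: int
    expires_at: datetime


@dataclass(frozen=True)
class CartMandate:
    merchant: str
    category: str
    total_cents: int


def scope_violations(intent: IntentMandate, cart: CartMandate, now: datetime) -> list:
    """Return every way the cart falls outside the intent; empty means in scope."""
    problems = []
    if now >= intent.expires_at:
        problems.append("intent mandate has expired")
    if cart.merchant != intent.merchant:
        problems.append("cart is for a different merchant")
    if cart.category not in intent.categories:
        problems.append("category not authorized")
    if cart.total_cents > intent.max_total_cents:
        problems.append("total exceeds authorized maximum")
    return problems


intent = IntentMandate(
    merchant="shop.example.com",
    categories=frozenset({"office-supplies"}),
    max_total_cents=50_000,
    expires_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
cart = CartMandate(merchant="shop.example.com", category="office-supplies", total_cents=12_500)
now = datetime(2029, 6, 1, tzinfo=timezone.utc)
violations = scope_violations(intent, cart, now)
```

Returning the full list of violations, rather than stopping at the first, gives the agent enough information to repair the cart in one pass.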
One implementation detail that took me longer to get right than I expected: the mandate's proof field must include a created timestamp, a verificationMethod reference to the signer's public key at a resolvable DID document or well-known URL, and a proofValue containing the base64url-encoded ECDSA signature of the mandate's canonical form. The canonical form matters: if you sign the wrong serialization of the JSON document, the signature will be invalid even if your key material is correct. We use JCS (JSON Canonicalization Scheme, RFC 8785) [IETF RFC 8785, 2020] for deterministic serialization before signing. The AP2 mandates deep dive covers the full signing and verification implementation. Adam Silva Consulting's AP2 Trust Layer service handles the complete cryptographic infrastructure — key generation, mandate lifecycle, and payment routing — so your engineering team does not have to become cryptography experts.
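The shape of the proof step can be sketched with the standard library alone. Two loud caveats: `canonicalize` below is a simplified stand-in that matches JCS only for simple payloads of ASCII keys, strings, and integers (RFC 8785 has its own rules for number formatting and key ordering), and the signer is injected as a callable because real ECDSA signing needs a cryptography library. The placeholder signer in the usage lines is a plain hash, not a signature, and the DID and mandate fields are hypothetical. Real Verifiable Credential proof suites may also bind the proof options into the signed data, which this sketch does not do.

```python
import base64
import hashlib
import json
from datetime import datetime, timezone
from typing import Callable, Optional


def canonicalize(document: dict) -> bytes:
    """Deterministic serialization: sorted keys, no insignificant whitespace."""
    return json.dumps(
        document, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def b64url(data: bytes) -> str:
    """Base64url-encode without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def attach_proof(
    mandate: dict,
    verification_method: str,
    sign: Callable[[bytes], bytes],
    created: Optional[str] = None,
) -> dict:
    """Return a copy of the mandate with a proof over its canonical form."""
    unsigned = {k: v for k, v in mandate.items() if k != "proof"}
    proof = {
        "created": created or datetime.now(timezone.utc).isoformat(),
        "verificationMethod": verification_method,
        "proofValue": b64url(sign(canonicalize(unsigned))),
    }
    return {**unsigned, "proof": proof}


def placeholder_sign(payload: bytes) -> bytes:
    """NOT a signature: a hash standing in for an ECDSA P-256 signer."""
    return hashlib.sha256(payload).digest()


mandate = {"type": "CartMandate", "total": 12500, "currency": "USD"}
signed = attach_proof(mandate, "did:example:issuer#key-1", placeholder_sign)
```

Because keys are sorted before serialization, two documents with the same content but different key order produce identical bytes, which is the property that keeps a valid key from yielding an invalid signature.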
How Do A2UI and AG-UI Bridge Agents to the User Interface?
Layers 5 and 6 of the agent protocol stack address the final delivery problem: how does the output of a multi-agent computation reach a human user? A2UI handles static UI composition — rendering structured agent output as schema-driven card interfaces. AG-UI, created by CopilotKit and adopted by Microsoft Agent Framework, Google ADK, AWS Strands Agents, Mastra, and Pydantic AI, handles real-time streaming UI with bidirectional state management, human-in-the-loop interrupts, and frontend tool calls [CopilotKit, 2025].
The distinction between A2UI and AG-UI maps to a fundamental architectural choice: does the user need to see the agent's work as it happens, or is a final composed result sufficient? A2UI is designed for the latter. It takes structured data output from the layers below — a product recommendation from a commerce agent, a summary from a research agent, a proposal from a negotiation agent — and renders it as a static, schema-driven UI component. The agent defines the data structure; the A2UI renderer produces the interface. No JavaScript streaming required. No WebSocket connection maintained.
AG-UI addresses a different class of problem: what happens when the agent computation is the user experience? When the user is watching an agent research, reason, and draft in real time — when they need to interrupt the agent mid-task to redirect its approach, or when the agent needs to ask the user a clarifying question before proceeding. AG-UI's protocol handles all of this through an event-based bidirectional channel over HTTP and WebSockets [CopilotKit, 2025].
The AG-UI event model uses event-sourced diffs for shared state management — rather than sending the full state on every update, the server streams state patches that the client applies incrementally. This is the same approach used by operational transformation systems in collaborative editors. For a real-time agent UI, it means the client always has a consistent view of the agent's internal state without the bandwidth cost of full-state serialization on every token emitted [CopilotKit, 2025]. The protocol also defines typed attachments — structured data objects the agent can send alongside text output, allowing the frontend to render rich UI components alongside streaming prose.
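The idea of streaming patches instead of full state can be illustrated with a tiny patch applier. This is a simplified sketch in the style of JSON Patch: it handles only `add`, `replace`, and `remove` on nested dictionaries (no arrays, no path escaping), and the operation shapes are assumptions for illustration, not AG-UI's actual event format.

```python
import copy


def apply_patch(state: dict, operations: list) -> dict:
    """Apply patch operations to a copy of the state and return the copy."""
    new_state = copy.deepcopy(state)
    for op in operations:
        keys = [part for part in op["path"].split("/") if part]
        parent = new_state
        for key in keys[:-1]:  # walk down to the container of the target key
            parent = parent[key]
        leaf = keys[-1]
        if op["op"] in ("add", "replace"):
            parent[leaf] = op["value"]
        elif op["op"] == "remove":
            del parent[leaf]
        else:
            raise ValueError(f"unsupported operation: {op['op']}")
    return new_state


client_state = {"cart": {"items": 1}, "status": "working"}

# A small delta from the server instead of a full state snapshot.
delta = [
    {"op": "replace", "path": "/cart/items", "value": 2},
    {"op": "add", "path": "/cart/total", "value": 40},
    {"op": "remove", "path": "/status"},
]
updated_state = apply_patch(client_state, delta)
```

The delta carries three small operations regardless of how large the shared state grows, which is where the bandwidth saving comes from.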
The human-in-the-loop interrupt mechanism in AG-UI is the layer 6 feature I find most architecturally significant. When an agent reaches a decision point that requires human judgment — a transaction above a certain threshold, a communication that could have reputational risk, a data access request that exceeds normal scope — it can emit an interrupt event that suspends its execution and surfaces a structured decision interface to the user. The user's response (approve, modify, reject) is sent back through the AG-UI channel, and the agent resumes with the user's input incorporated into its context. This is not a workaround for agent limitations — it is a first-class protocol feature that makes high-stakes agentic workflows safe to deploy. At 12,585 GitHub stars [CopilotKit, 2025], the community has validated both the need and the implementation.
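The suspend-and-resume behavior maps naturally onto a Python generator: the agent yields an interrupt, pauses, and continues once the user's decision is sent back in. This is a conceptual sketch only. The threshold, event shape, and decision strings are hypothetical and are not AG-UI's wire format.

```python
from typing import Callable, Generator

APPROVAL_THRESHOLD_CENTS = 50_000  # hypothetical policy threshold


def purchase_agent(total_cents: int) -> Generator[dict, str, str]:
    """Agent that pauses for human approval on large purchases."""
    if total_cents >= APPROVAL_THRESHOLD_CENTS:
        # Execution suspends here until the runner sends a decision back.
        decision = yield {
            "type": "interrupt",
            "reason": "amount above approval threshold",
            "total_cents": total_cents,
        }
        if decision != "approve":
            return "cancelled"
    return "purchased"


def run(agent: Generator[dict, str, str], decide: Callable[[dict], str]) -> str:
    """Drive the agent, routing each interrupt to a human decision function."""
    try:
        event = next(agent)
        while True:
            event = agent.send(decide(event))
    except StopIteration as finished:
        return finished.value


approved = run(purchase_agent(60_000), lambda event: "approve")
rejected = run(purchase_agent(60_000), lambda event: "reject")
small = run(purchase_agent(1_000), lambda event: "reject")  # never interrupts
```

The agent's local state survives the pause untouched, so the human decision lands in the same context the agent was already working in.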

Why Does the Stack Architecture Matter More Than Any Single Protocol?
A business that implements only MCP has solved the data access problem but cannot coordinate multi-agent workflows, cannot be discovered by commerce agents, cannot accept cryptographically authorized payments, and cannot deliver real-time agent interfaces. The value of the stack architecture is not additive — it is multiplicative. Each layer unlocks capabilities in every layer above it.
The protocol explosion concern — the feeling that six new protocols in eighteen months represents fragmentation — dissolves when you see the stack. There is no overlap. MCP does not do what A2A does. UCP does not do what AP2 does. A2UI does not do what AG-UI does. Each protocol is solving a problem that exists at exactly one layer of the stack and nowhere else. The proliferation is not competing solutions to the same problem; it is a complete coverage of six distinct problem classes.
The governance structure reinforces this. Anthropic donated MCP to the Linux Foundation's Agentic AI Foundation [Linux Foundation, 2025]. Google Cloud donated A2A to the same foundation [Google Cloud, 2025]. IBM merged its ACP into A2A rather than fragment the agent coordination layer. These are not the actions of competing standards bodies defending territory — they are the actions of an industry converging on a shared infrastructure stack.
The cost of inaction is not theoretical. It is a monthly invoice with no line item. Every month a business operates without a UCP manifest, it is invisible to every AI commerce agent running capability discovery. Those agents are not waiting: they are routing to UCP-compliant competitors and building transactional history with them. Every month without AP2 means human-in-the-loop payment authorization on every agent-initiated transaction, a margin drain disguised as a process requirement. The implementation window is 18 to 24 months before UCP and AP2 reach the same status as HTTPS: table stakes that every serious operator has, not a differentiator you can advertise. The businesses implementing now are not just solving an operational problem. They are building first-mover leads in agent discovery ranking and mandate trust history that latecomers will spend years trying to close. The question is not whether to implement the stack. The question is whether you implement it before or after your competitors do.
The token efficiency implications of the stack are also worth quantifying. An AI agent discovering a UCP-compliant business, delegating a task via A2A, and authorizing payment via AP2 mandate spends a deterministic, bounded number of tokens on infrastructure operations. A single HTTP GET to the UCP manifest. A single A2A task submission. A single AP2 mandate verification. Compare that to the agent that has to parse a website — 4,200 to 8,700 tokens — guess at available services from navigation structure, and attempt transactions through form-fill automation. The stack does not just make agentic commerce possible — it makes it economically viable at scale.
Last Fact-Checked: March 2026. MCP adoption figures reflect Linux Foundation SDK download reports. A2A partner count reflects Google Cloud developer blog. AP2 and UCP specifications reflect published Google specification documents. AG-UI star count reflects CopilotKit GitHub repository.
When I first got all six layers operating in a single production deployment — MCP serving data to an A2A-coordinated agent cluster, UCP exposing the capabilities that cluster could transact, AP2 mandates authorizing the payments, A2UI composing the static output, AG-UI streaming the live agent work to the user interface — I thought about something Gay Kimons told me in 2016, when we were still building AI teaching systems on hardware that most people today would not recognize as a computer. She said: "The system is not finished when it works. It is finished when you cannot imagine adding anything and cannot find anything to remove." The agent protocol stack is finished in that sense. Six layers. Six protocols. Every one of them necessary. None of them redundant. The architecture is correct — and now it is your job to implement it.
Frequently Asked Questions
What are the six protocols in the agent protocol stack?
The six protocols are MCP (Model Context Protocol) for agent-to-data access, A2A (Agent-to-Agent) for agent coordination, UCP (Universal Commerce Protocol) for commerce discovery, AP2 (Agent Payments Protocol) for cryptographic payment authorization, A2UI for static UI composition, and AG-UI for real-time streaming UI. Each occupies a distinct layer with zero functional overlap, per the Linux Foundation Agentic AI Foundation governance [Linux Foundation, 2025].
How does MCP differ from A2A?
MCP connects a single AI agent to external data sources, APIs, and tools using JSON-RPC 2.0 — it is vertical integration (agent-to-data). A2A enables multiple autonomous agents to discover, delegate tasks, and collaborate using Agent Cards and HTTP+SSE — it is horizontal integration (agent-to-agent). MCP has 97 million+ monthly SDK downloads; A2A has 150+ enterprise partners including Salesforce and SAP [Google Cloud, 2025].
What is an Agent Card in the A2A protocol?
An Agent Card is a JSON metadata document hosted at a well-known URL that describes an AI agent's capabilities, supported task types, authentication requirements, and communication endpoint. Other agents fetch the Agent Card to discover what the agent can do before delegating tasks, per the A2A specification [Google Cloud, 2025].
How does AP2 secure agent payments?
AP2 uses ECDSA P-256 signing against the W3C Verifiable Credentials standard to create cryptographic mandates — signed JSON documents granting agents legal authority to transact. Intent mandates authorize exploration; Cart mandates authorize specific purchases. Payment routing forks at $25,000: Stripe Checkout below, Stripe wire transfer ($8 flat fee, irrevocable) above. Learn more about protocol implementation at /services/ucp-implementation.
What is AG-UI and who created it?
AG-UI (Agent-User Interaction Protocol) is CopilotKit's open event-based protocol for bidirectional connections between user-facing frontends and agentic backends. It supports live token streaming, event-sourced state management, typed attachments, and human-in-the-loop interrupts over HTTP and WebSockets. It has 12,585 GitHub stars and official integrations with Microsoft Agent Framework, Google ADK, and AWS Strands Agents [CopilotKit, 2025].
Sources & References
- Anthropic. Model Context Protocol specification: JSON-RPC 2.0, 4 primitives, OAuth 2.1+PKCE security model.
- Google Cloud. A2A specification: Agent Cards, task lifecycle, 150+ enterprise partners.
- Google Developers Blog. "A Developer's Guide to AI Agent Protocols": comprehensive protocol comparison.
- CopilotKit. AG-UI protocol: event-based streaming UI, human-in-the-loop interrupts.
- Linux Foundation. Agentic AI Foundation: MCP and A2A governance, 97M+ monthly SDK downloads.
- W3C. Verifiable Credentials Data Model v2.0: the cryptographic standard underlying AP2 mandates.
- IETF. RFC 5785, Defining Well-Known URIs: the standard enabling UCP and A2A discovery endpoints.
- NIST. FIPS 186-5, Digital Signature Standard: ECDSA P-256 used in AP2 mandate signing.