Model Context Protocol: How One JSON-RPC Standard Collapsed the M×N Integration Problem

In November 2024 Anthropic published a specification almost no one asked for, and within a year nearly every major AI lab had shipped support for it. The Model Context Protocol did not introduce a new model, a faster inference kernel, or a smarter agent loop. It introduced a wire format: a way for a language model running inside one application to call a tool living in a completely separate process, written by a completely separate team, without either side knowing anything about the other in advance. The interesting part is not the JSON. It is what the JSON eliminated.

Every team building agents in 2024 hit the same wall. An agent that needs to read your GitHub issues, query your Postgres database, and post to Slack needs three integrations. Build a second agent and you need three more, because the first agent's GitHub glue was wired into its own prompt format and its own runtime. The cost of connecting agents to capabilities grew as the product of two numbers, not their sum. MCP is the standard that turned that product back into a sum, and most of its design decisions, including the ones that later caused security headaches, follow from that single goal.

Why this matters: The hard problem in agentic AI is rarely the model. It is wiring a fast-moving model to a slow-moving world of databases, SaaS APIs, and internal tools, safely, repeatedly, and without rewriting the glue every quarter. A protocol is how that wiring stops being bespoke.

TL;DR

MCP is an open, JSON-RPC 2.0 protocol that standardizes how an AI application (the host) connects to external tools and data (servers), turning an M×N integration explosion into an M+N problem.
The architecture is deliberately three-tier: a host owns the model and the conversation, clients maintain one isolated session each, and servers expose tools, resources, and prompts. Servers cannot see the full conversation or each other, by design.
Anthropic introduced MCP in November 2024; by December 2025 there were more than 10,000 active public MCP servers and adoption across ChatGPT, Gemini, Copilot, Cursor, and VS Code.
The protocol matured fast through dated revisions: stdio plus Streamable HTTP transport, then an OAuth 2.1 authorization framework binding tokens to a specific server via RFC 8707 resource indicators.
The same openness that makes servers trivial to publish also makes tool poisoning and prompt injection structural risks. A 2025 benchmark found over 85% of catalogued attacks compromised at least one major MCP platform.
In December 2025 Anthropic donated MCP to the Linux Foundation's new Agentic AI Foundation, moving it from a single vendor's project to neutral governance.

At a Glance

flowchart LR
  User[User] --> Host
  subgraph HostBox[AI Application]
    Host[Host runtime] --> LLM[Language model]
    Host --> C1[Client 1]
    Host --> C2[Client 2]
  end
  C1 -->|JSON-RPC session| S1[GitHub server]
  C2 -->|JSON-RPC session| S2[Postgres server]
  S1 --> GH[(GitHub API)]
  S2 --> DB[(Database)]
  class User,GH,DB blue
  class Host,LLM,C1,C2 purple
  class S1,S2 teal
  classDef blue fill:#1e40af,stroke:#3b82f6,stroke-width:1px,color:#fff
  classDef purple fill:#6d28d9,stroke:#a78bfa,stroke-width:1px,color:#fff
  classDef teal fill:#0e7490,stroke:#22d3ee,stroke-width:1px,color:#fff

The host owns the model and the user. Each external capability gets its own client and its own server, connected by an isolated session. Add a new tool by adding a server, not by editing the host.

[IMAGE: Side-by-side schematic. Left panel "Before MCP" shows 4 agents each drawing 5 separate colored lines to 5 tools (20 bespoke adapters). Right panel "With MCP" shows the same 4 agents connecting to a single protocol bus that fans out to 5 servers. Annotate the line counts: 20 versus 9.]

Before a Protocol: The M×N Wall

To see why MCP exists, count the adapters. Suppose a company runs \(M\) distinct AI applications (a coding assistant, a support bot, an internal research agent) and wants each to reach \(N\) data sources and tools. Without a shared protocol, every application implements its own client logic for every tool: authentication, request shaping, response parsing, error handling. The number of integrations to build and maintain is:

\[\text{Integrations} = M \times N\]

Five applications times twenty tools is one hundred separate pieces of glue, each subtly different, each breaking when an upstream API changes. With a shared protocol, each application implements the protocol once and each tool implements the protocol once. The cost becomes:

\[\text{Integrations} = M + N\]

Five plus twenty is twenty-five. The same idea rescued developer tooling once before. Before Microsoft's Language Server Protocol in 2016, every code editor needed a custom integration for every programming language, and the editor-times-language matrix was the reason good tooling never reached niche languages. LSP let an editor speak one protocol and a language implement one server. MCP is explicitly the LSP move applied to AI applications and tools, a lineage its own designers cite.
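The arithmetic behind the two formulas is small enough to check directly. The sketch below is illustrative only: the function names are made up for this example, and it counts adapters without modeling the effort of writing any single one.

```python
def integrations_without_protocol(apps: int, tools: int) -> int:
    """Every application carries its own adapter for every tool (M x N)."""
    return apps * tools


def integrations_with_protocol(apps: int, tools: int) -> int:
    """Each application and each tool implements the shared protocol once (M + N)."""
    return apps + tools


if __name__ == "__main__":
    # (applications, tools) pairs; the first is the example from the text.
    for apps, tools in [(5, 20), (4, 5), (10, 50)]:
        bespoke = integrations_without_protocol(apps, tools)
        shared = integrations_with_protocol(apps, tools)
        print(f"M={apps} N={tools} bespoke={bespoke} shared={shared} saved={bespoke - shared}")
```

The gap widens with scale: adding one more tool costs M new adapters in the bespoke case and exactly one server in the shared case.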

The wall was not hypothetical. By mid-2024, function calling existed in the OpenAI and Anthropic APIs, but a "tool" was just a JSON schema you pasted into a request, plus whatever code you wrote to actually execute the call. That execution code was the integration, and it lived inside each application. Nothing was reusable across applications, runtimes, or vendors. A Slack tool written for one agent framework could not be dropped into another without a rewrite.

timeline
  title Evolution toward a tool protocol
  2016 : Language Server Protocol decouples editors from languages
  2023 : LLM function calling and the ChatGPT plugin era
  2024 : Anthropic publishes MCP in November
  2025 : OpenAI, Google, and Microsoft adopt MCP. Streamable HTTP and OAuth 2.1 land
  2025 : MCP donated to the Linux Foundation Agentic AI Foundation in December

[IMAGE: Annotated growth curve. X-axis time from Nov 2024 to Dec 2025, Y-axis count of public MCP servers on a log scale, with a labeled marker at "10,000+ servers, Dec 2025" and callout pins for the OpenAI and Google adoption announcements in spring 2025.]

How MCP Actually Works

MCP is a stateful, bidirectional protocol built on JSON-RPC 2.0, the same lightweight remote-procedure-call convention used by the Language Server Protocol and Ethereum nodes (JSON-RPC 2.0 Specification). Every message is a JSON object that is either a request (it has an id and expects a response), a response, or a notification (no id, no reply). On top of that thin substrate, MCP layers a session: an initialization handshake, capability negotiation, and then a long-lived exchange of tool calls, resource reads, and notifications.
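The three message kinds can be told apart from the fields alone. The following is a minimal sketch of that rule, not code from any MCP SDK; the function name `classify_message` is an assumption made for illustration.

```python
from typing import Any


def classify_message(message: dict[str, Any]) -> str:
    """Return 'request', 'response', or 'notification' for a JSON-RPC 2.0 object."""
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "method" in message:
        # A method with an id expects a reply; without an id it is fire-and-forget.
        return "request" if "id" in message else "notification"
    if "id" in message and ("result" in message or "error" in message):
        # Responses echo the id of the request they answer.
        return "response"
    raise ValueError("unrecognized JSON-RPC message shape")


if __name__ == "__main__":
    print(classify_message({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}))
    print(classify_message({"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}))
    print(classify_message({"jsonrpc": "2.0", "method": "notifications/initialized"}))
```

The id is what lets a peer match a response to its request when several calls are in flight on the same session.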

The three roles

The official architecture is a client-host-server triangle, and the separation is load-bearing rather than cosmetic (MCP Architecture, 2025-11-25).

The host is the application the user actually touches: Claude Desktop, an IDE like Cursor, or a custom agent. It owns the language model, holds the full conversation, creates client instances, and enforces consent. Crucially, it is the only component that sees everything.

A client is a connector the host spins up, one per server, maintaining a single stateful session. Clients route messages, negotiate capabilities, and keep each server walled off from the others.

A server exposes capabilities and nothing else. It might wrap the GitHub API, a local filesystem, or a Postgres database. It can be a subprocess on the same machine or a remote web service. The design principles are explicit that a server should be "extremely easy to build" and that a server must not be able to read the whole conversation or see into other servers. That isolation is the foundation of the security model. It does not extend to the server's own metadata, which the host passes to the model largely on trust, and that is why later attacks that smuggle instructions through tool descriptions were so effective.

flowchart TD
  subgraph Host[Host process]
    Coord[Coordinator] --> Consent[Consent and policy]
    Coord --> Context[Conversation context]
  end
  Coord --> ClientA[Client A]
  Coord --> ClientB[Client B]
  ClientA -->|stdio| ServerA[Filesystem server]
  ClientB -->|Streamable HTTP| ServerB[Remote SaaS server]
  ServerA --> Local[(Local files)]
  ServerB --> Cloud[(SaaS API)]
  class Coord,Consent,Context,ClientA,ClientB purple
  class ServerA,ServerB teal
  class Local,Cloud blue
  classDef purple fill:#6d28d9,stroke:#a78bfa,stroke-width:1px,color:#fff
  classDef teal fill:#0e7490,stroke:#22d3ee,stroke-width:1px,color:#fff
  classDef blue fill:#1e40af,stroke:#3b82f6,stroke-width:1px,color:#fff

[IMAGE: Labeled triangle schematic of the client-host-server architecture. Host at the apex annotated "sees everything, owns model and consent," two client nodes mid-level annotated "1:1, isolated session," two server nodes at the base annotated "sees only its own context." Draw the isolation walls between servers as dashed barriers.]

The three primitives

Servers expose capability through three server-side primitives, and the distinction between them is the part engineers most often get wrong.

Tools are model-controlled actions. They are functions the model can decide to invoke, each described by a name, a human-and-model-readable description, and a JSON Schema for its arguments. Calling a tool can have side effects: send an email, run a query, create a file. Because the model chooses when to call them based on their descriptions, tool metadata is part of the model's effective prompt.

Resources are application-controlled context. A resource is addressable read-only data, identified by a URI, that the host can pull into the model's context: a file's contents, a database schema, a log. Resources are meant to be selected by the application or user, not invoked autonomously by the model.

Prompts are user-controlled templates. A prompt is a reusable, parameterized message template a server offers, surfaced to the user as something like a slash command. It packages a known-good interaction so users do not have to hand-craft it.
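Of the three server primitives, tools carry the most structure. A tool definition in a server's tool list has a name, a description, and an `inputSchema` holding the JSON Schema for its arguments. The sketch below shows a hypothetical `get_issue` definition and a deliberately simplified argument check. It covers only required fields and a few scalar types, so it is not a substitute for a real JSON Schema validator.

```python
from typing import Any

# Hypothetical tool definition in the shape a server returns from tools/list.
GET_ISSUE_TOOL: dict[str, Any] = {
    "name": "get_issue",
    "description": "Fetch a GitHub issue by number.",
    "inputSchema": {
        "type": "object",
        "properties": {"number": {"type": "integer"}},
        "required": ["number"],
    },
}

# Only the scalar types this sketch knows how to check.
_JSON_TYPES = {"integer": int, "number": (int, float), "string": str}


def _matches(value: Any, json_type: Any) -> bool:
    if json_type == "boolean":
        return isinstance(value, bool)
    expected = _JSON_TYPES.get(json_type)
    if expected is None:
        return True  # types outside this sketch are not checked
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, expected) and not isinstance(value, bool)


def check_arguments(tool: dict[str, Any], arguments: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the arguments look acceptable."""
    schema = tool["inputSchema"]
    properties = schema.get("properties", {})
    problems = [
        f"missing required argument: {name}"
        for name in schema.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
        elif not _matches(value, properties[name].get("type")):
            problems.append(f"argument '{name}' should be {properties[name]['type']}")
    return problems
```

Because the description string is handed to the model verbatim, everything in this dictionary except the schema check is effectively prompt text written by the server author.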

There is a symmetric set of client-side primitives that let the relationship run both ways. Sampling lets a server ask the host's model to generate a completion, so a server can use intelligence without shipping its own model. Roots let a client tell a server which filesystem or URI boundaries it may operate within. Elicitation, added in the 2025-06-18 revision, lets a server pause and request structured input from the user mid-operation, for example asking for a confirmation or a missing parameter through a host-rendered form.

[IMAGE: A 2x3 labeled grid "anatomy" figure. Top row server primitives (Tools = model-controlled, Resources = app-controlled, Prompts = user-controlled). Bottom row client primitives (Sampling, Roots, Elicitation). Each cell annotated with who controls it and one concrete example.]

The lifecycle

A session begins with negotiation. The client sends an initialize request carrying its protocol version and capabilities; the server replies with its own, and the client confirms with an initialized notification. From that point each side knows what the other supports and uses only those features: a client subscribes to resource updates only if the server advertised subscription support, for example. This capability handshake is what lets the protocol add features without breaking older peers.

sequenceDiagram
  participant U as User
  participant H as Host plus Model
  participant C as Client
  participant S as Server
  U->>H: Ask a question
  H->>C: Start session
  C->>S: initialize (version, capabilities)
  S-->>C: capabilities (tools, resources)
  C->>S: tools/list
  S-->>C: tool definitions
  H->>H: Model selects a tool
  C->>S: tools/call (name, arguments)
  S-->>C: result content
  H-->>U: Answer grounded in result
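The initialize exchange at the top of the sequence can be sketched from the client's side. This is a simplified illustration rather than SDK code: the client name, the version list, and the helper names are assumptions, and real clients track more capability detail than the three flags recorded here.

```python
from typing import Any

# Protocol revisions this hypothetical client understands, newest first.
SUPPORTED_VERSIONS = ["2025-11-25", "2025-06-18", "2025-03-26"]

# Sent by the client once it accepts the server's initialize result.
INITIALIZED_NOTIFICATION = {"jsonrpc": "2.0", "method": "notifications/initialized"}


def build_initialize_request(request_id: int) -> dict[str, Any]:
    """Offer the newest version we support, plus the client-side capabilities."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": SUPPORTED_VERSIONS[0],
            "capabilities": {"roots": {"listChanged": True}, "sampling": {}},
            "clientInfo": {"name": "example-host", "version": "0.1.0"},
        },
    }


def handle_initialize_result(result: dict[str, Any]) -> dict[str, Any]:
    """Accept the server's chosen version if we support it and record what it offers."""
    version = result["protocolVersion"]
    if version not in SUPPORTED_VERSIONS:
        raise RuntimeError(f"unsupported protocol version: {version}")
    capabilities = result.get("capabilities", {})
    return {
        "version": version,
        "has_tools": "tools" in capabilities,
        "can_subscribe": bool(capabilities.get("resources", {}).get("subscribe")),
    }
```

A client built this way never sends a subscription request to a server that did not advertise support for it, which is the property that lets old and new peers coexist.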

The model never speaks JSON-RPC directly. The host translates: it lists the server's tools, formats them into the model's native tool-calling schema, lets the model choose, then converts the model's chosen call back into a tools/call request. MCP standardizes the host-to-server hop. The host-to-model hop stays whatever each vendor already uses.
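The two translation steps can be sketched as a pair of small functions. The model-side format below is a hypothetical function-calling shape chosen for illustration; each vendor's actual schema differs, and only the `tools/call` envelope on the MCP side is fixed by the protocol.

```python
from typing import Any


def to_model_tool(mcp_tool: dict[str, Any]) -> dict[str, Any]:
    """Reshape an MCP tool definition into a hypothetical vendor tool-calling entry."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            "parameters": mcp_tool["inputSchema"],  # JSON Schema passes through unchanged
        },
    }


def to_tools_call(request_id: int, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Wrap the model's chosen call in the JSON-RPC envelope the server expects."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
```

Swapping models changes only `to_model_tool`; the server never sees the difference.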

Seeing It in Motion

Two views clarify what the prose cannot. The first is the decision flow when a model encounters a task that needs an external capability.

flowchart TD
  Start[User request] --> Need{Needs external<br/>capability}
  Need -->|No| Answer[Answer directly]
  Need -->|Yes| Discover[Host lists server tools]
  Discover --> Choose[Model picks tool and args]
  Choose --> Gate{Host consent<br/>policy}
  Gate -->|Denied| Refuse[Block and inform user]
  Gate -->|Allowed| Invoke[Client sends tools/call]
  Invoke --> Exec[Server executes]
  Exec --> Return[Result into context]
  Return --> Answer
  class Start,Answer blue
  class Discover,Choose,Invoke,Exec,Return purple
  class Gate amber
  class Refuse rose
  classDef blue fill:#1e40af,stroke:#3b82f6,stroke-width:1px,color:#fff
  classDef purple fill:#6d28d9,stroke:#a78bfa,stroke-width:1px,color:#fff
  classDef amber fill:#b45309,stroke:#fbbf24,stroke-width:1px,color:#fff
  classDef rose fill:#be123c,stroke:#fb7185,stroke-width:1px,color:#fff

The consent gate is not a footnote. The specification makes the host the policy enforcement point, because the model's tool choice is influenced by text the server controls. The second view is the session lifecycle as a state machine, which makes the role of negotiation visible.

stateDiagram-v2
  [*] --> Connecting
  Connecting --> Initializing: transport open
  Initializing --> Ready: capabilities exchanged
  Ready --> Active: tool or resource call
  Active --> Ready: result delivered
  Ready --> Closing: host ends session
  Active --> Closing: error or shutdown
  Closing --> [*]
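The state diagram maps directly onto a transition table. The sketch below encodes the same six edges; the event names are invented for this example and do not come from the specification.

```python
from enum import Enum


class SessionState(Enum):
    CONNECTING = "Connecting"
    INITIALIZING = "Initializing"
    READY = "Ready"
    ACTIVE = "Active"
    CLOSING = "Closing"


# (current state, event) -> next state, one entry per edge in the diagram.
TRANSITIONS = {
    (SessionState.CONNECTING, "transport_open"): SessionState.INITIALIZING,
    (SessionState.INITIALIZING, "capabilities_exchanged"): SessionState.READY,
    (SessionState.READY, "call_started"): SessionState.ACTIVE,
    (SessionState.ACTIVE, "result_delivered"): SessionState.READY,
    (SessionState.READY, "host_ends_session"): SessionState.CLOSING,
    (SessionState.ACTIVE, "error_or_shutdown"): SessionState.CLOSING,
}


def advance(state: SessionState, event: str) -> SessionState:
    """Return the next state, or raise if the diagram has no such edge."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.value} on {event!r}") from None
```

Note what the table forbids: there is no edge from Connecting to Active, so a tool call cannot happen before capabilities have been exchanged.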

By the Numbers

MCP's quantitative story is partly about integration arithmetic and partly about how fast the ecosystem and the spec moved. The figures below come from the official specification, Anthropic's announcements, and a security benchmark published on arXiv; vendor adoption counts are self-reported and should be read as approximate.

Metric	Value	Source
Integration cost, no protocol	\(O(M \times N)\)	Protocol design rationale
Integration cost, with MCP	\(O(M + N)\)	Protocol design rationale
Wire format and transports	JSON-RPC 2.0 over stdio or Streamable HTTP	Spec 2025-11-25
Server primitives	3 (tools, resources, prompts)	MCP specification
Active public servers, Dec 2025	More than 10,000	Anthropic, Dec 2025
Catalogued attacks compromising a major platform	Over 85%	MCPSecBench, arXiv:2508.13220
Average effectiveness of tested defenses	Under 30%	MCPSecBench, arXiv:2508.13220

The integration math deserves a concrete reading. The protocol does not make any single integration cheaper to write; an MCP GitHub server is roughly as much work as a bespoke GitHub adapter. What changes is reuse. That one server now works for every MCP host that will ever exist, which is why a public ecosystem of more than 10,000 servers can exist at all. In a bespoke world there is no registry, because no adapter is portable.

[IMAGE: Two-line plot. X-axis number of tools N, Y-axis number of integrations to maintain, for a fixed M=5. One line is M×N rising steeply, the other is M+N rising gently. Shade the gap and label it "adapters you never have to write."]

A Concrete Example

Walk a single real interaction end to end. A developer asks their IDE assistant: "Has issue 412 been fixed, and if so, close it." The IDE is the host, with a GitHub MCP server connected over Streamable HTTP.

The session is already initialized, so the host has cached the server's tool list. It includes get_issue, list_commits, and close_issue, each with a JSON Schema. The host has formatted these into the model's tool-calling format. The model, reading the request, decides it first needs to inspect the issue. It emits a tool call, and the client serializes it:

{ "jsonrpc": "2.0", "id": 7, "method": "tools/call",
  "params": { "name": "get_issue", "arguments": { "number": 412 } } }

The server hits the GitHub API and returns structured content: the issue is open, titled "Null pointer in auth refresh," and the most recent linked commit message reads "fix: guard null token in refresh path." The result flows back into the model's context. Now the model has a decision the protocol deliberately routes through the host. Closing an issue is a side effect, so before the close_issue call executes, the host's consent policy intercepts it. The IDE shows the developer a confirmation: "Allow GitHub server to close issue 412." Only on approval does the client send:

{ "jsonrpc": "2.0", "id": 8, "method": "tools/call",
  "params": { "name": "close_issue", "arguments": { "number": 412 } } }

The table below tracks the state as the interaction proceeds.

Step	Actor	Action	State change
1	Model	Choose `get_issue(412)`	Pending read
2	Server	Return issue + commit	Context now holds issue status
3	Model	Choose `close_issue(412)`	Pending write
4	Host	Prompt user for consent	Awaiting approval
5	User	Approve	Write authorized
6	Server	Close issue on GitHub	Issue state open to closed

Two things in this trace are the whole point of the protocol. First, the GitHub server author never knew which IDE or model would call it, and the IDE author never wrote a line of GitHub-specific code. Second, the consent prompt sits at the host because the host is the only party that can be trusted to mediate between a persuadable model and an action with consequences.
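The consent step in this trace can be sketched as a small host-side gate. Everything here is an assumption for illustration: the set of side-effecting tools is hard-coded, the function name is invented, and `ask_user` stands in for whatever confirmation UI a real host renders.

```python
from typing import Any, Callable

# Hypothetical host policy: tools this host treats as having side effects.
SIDE_EFFECT_TOOLS = {"close_issue"}


def authorize_tool_call(
    server_label: str,
    tool_name: str,
    arguments: dict[str, Any],
    ask_user: Callable[[str], bool],
) -> bool:
    """Return True if the call may be handed to the client for sending."""
    if tool_name not in SIDE_EFFECT_TOOLS:
        # Reads pass through in this sketch; a stricter host could ask for these too.
        return True
    question = f"Allow {server_label} server to run {tool_name} with {arguments}?"
    return ask_user(question)


if __name__ == "__main__":
    def deny_everything(question: str) -> bool:
        return False

    print(authorize_tool_call("GitHub", "get_issue", {"number": 412}, deny_everything))
    print(authorize_tool_call("GitHub", "close_issue", {"number": 412}, deny_everything))
```

The policy lives in the host's own configuration rather than in anything the server declares, so a server cannot talk its way past the gate by describing a write as harmless.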

[IMAGE: Annotated JSON-RPC trace rendered as a figure. Show the two tools/call requests and their responses stacked, with margin annotations pointing at the id field (request/response pairing), the method, and a red flag on the close_issue call labeled "side effect, host consent required."]

Where It Breaks

The design choice that makes MCP easy, trusting a server's self-described metadata, is also its central security weakness. Because a tool's description is fed to the model to help it decide when to call the tool, that description is an injection surface. A malicious server can embed instructions inside a tool's description that the model reads as if they were part of its own guidance. This is tool poisoning, and security researcher Simon Willison flagged the broader class of prompt-injection exposure within months of MCP's rise (Willison, 2025).

The systematic picture is sobering. MCPSecBench, evaluated across Claude Desktop, GPT-4.1, and Cursor with each of 17 attack types run repeatedly, found that over 85% of catalogued attacks compromised at least one platform, and that the protection mechanisms tested were effective less than 30% of the time on average (Yang et al., 2025, MCPSecBench, arXiv:2508.13220). The benchmark also found meaningful variation between hosts: prompt-injection defenses that held on one platform failed on another, which means security is partly a host implementation property, not just a protocol property.

Several distinct failure modes recur. Rug pulls, where a tool's definition is benign at install time and silently mutates later to exfiltrate secrets, exploit the fact that nothing in the base protocol pins a tool's behavior to its approved description. Cross-server shadowing lets a malicious server intercept or override calls intended for a trusted one when both are connected to the same host. And the supporting tooling has had its own holes: CVE-2025-49596, a vulnerability in the MCP Inspector debugging tool, scored 9.4 on the CVSS scale because an unauthenticated instance allowed arbitrary command execution.
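One host-side mitigation for the rug-pull case is to fingerprint each tool definition when the user approves it and compare on every later listing. The sketch below is one possible approach, not part of the base protocol, and the function names are invented. It detects changes to a tool's declared metadata only; a server that keeps its description fixed while changing what the tool actually does would pass this check.

```python
import hashlib
import json
from typing import Any


def fingerprint(tool: dict[str, Any]) -> str:
    """Hash a tool definition in a canonical form so key order does not matter."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(approved: dict[str, str], current_tools: list[dict[str, Any]]) -> list[str]:
    """Names of tools that are new or whose definition differs from the approved hash."""
    return [
        tool["name"]
        for tool in current_tools
        if approved.get(tool["name"]) != fingerprint(tool)
    ]
```

A host using this would store `approved` at install time and require fresh consent for any name that `changed_tools` returns.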

A subtler failure is not security but semantics. The model only knows about a tool what its description says. A poorly described tool, or two tools with overlapping descriptions, leads the model to pick wrong, pass malformed arguments, or loop. As the number of connected servers grows, the combined tool list also bloats the context window and degrades selection accuracy, an emerging operational limit that pushes teams toward dynamic tool filtering rather than exposing every tool at once.
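Dynamic tool filtering can be sketched with a deliberately naive ranking: score each tool by word overlap between the user's request and the tool's name and description, then expose only the top few. This is an illustrative baseline with invented names, not a recommended retrieval method; production systems would more likely use embeddings or a learned ranker.

```python
import re
from typing import Any


def _words(text: str) -> set[str]:
    """Lowercased alphanumeric tokens; underscores split names like get_issue."""
    return set(re.findall(r"[a-z0-9]+", text.lower().replace("_", " ")))


def select_tools(request: str, tools: list[dict[str, Any]], limit: int = 3) -> list[dict[str, Any]]:
    """Return up to `limit` tools that share at least one word with the request."""
    request_words = _words(request)
    scored = []
    for tool in tools:
        tool_words = _words(tool["name"] + " " + tool.get("description", ""))
        overlap = len(request_words & tool_words)
        if overlap:
            scored.append((overlap, tool))
    # Sort by score only; the sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]
```

Because the descriptions being ranked are written by the servers, any filter of this kind inherits the metadata-trust problem: a server can pad its description with common words to win selection.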

[IMAGE: Threat-surface schematic of the host-client-server triangle with four red attack markers placed on the exact edges they exploit: tool poisoning on the server-to-model description path, rug pull on the server post-approval, cross-server shadowing between two clients, and token leakage on the auth path. Each marker carries a one-line caption.]

Alternative Designs

MCP is not the only way to connect models to the world, and it does not solve every adjacent problem. The honest comparison places it against the approaches it competes with and the ones it complements.

Approach	Strengths	Weaknesses	Best when
Bespoke function calling	Full control, minimal moving parts	Zero reuse, M×N glue, per-app rewrites	One app, a few stable tools
OpenAPI plus a generator	Huge existing API surface, mature tooling	Not built for stateful model interaction or consent	Wrapping existing REST APIs read-only
Framework tool abstractions	Rich orchestration, batteries included	Framework lock-in, tools rarely portable across frameworks	You live inside one agent framework
MCP	Vendor-neutral, portable servers, consent model	Young security story, metadata trust, context bloat	Many hosts and tools must interoperate
A2A (agent-to-agent)	Coordinates peer agents	Different layer, not a tool-access protocol	Agents must delegate to other agents

The most common confusion is MCP versus agent-to-agent protocols. They are not competitors. MCP standardizes the vertical link between an application and its tools and data. Agent-to-agent protocols standardize the horizontal link between autonomous agents that delegate work to each other. A mature system may speak both: an agent uses A2A to hand a subtask to a peer, and that peer uses MCP to actually touch a database. Framework tool abstractions like those in LangChain or LangGraph operate at yet another level; increasingly they consume MCP servers rather than replace them, treating MCP as the portable substrate beneath their orchestration.

How It Is Used in Practice

Adoption is the strongest evidence that the protocol solved a real problem rather than an imagined one. Anthropic shipped MCP support in Claude Desktop and released reference servers for filesystems, GitHub, Slack, and Postgres at launch. Through 2025 the protocol crossed vendor lines that rarely get crossed: OpenAI adopted MCP for its Agents tooling and ChatGPT, Google confirmed support for Gemini, and Microsoft built it into Copilot Studio and VS Code. Cursor, Replit, and Sourcegraph wired it into developer products. By Anthropic's December 2025 account there were more than 10,000 active public MCP servers in the wild.

Production deployment surfaces engineering concerns the spec only partly addresses. Transport choice is the first. The protocol defines two transports: stdio, where the host launches the server as a local subprocess and they speak over standard input and output, ideal for local tools with filesystem access; and Streamable HTTP, introduced in the 2025-03-26 revision to replace the earlier HTTP-plus-SSE design, which lets remote servers sit behind a single, proxy-friendly HTTP endpoint. Remote servers raised the authorization question the early spec dodged. The 2025-03-26 revision answered it with an OAuth 2.1-based framework, and the 2025-06-18 revision tightened it: clients use PKCE, discover authorization servers via RFC 9728 protected-resource metadata, and, critically, bind access tokens to a specific MCP server using RFC 8707 resource indicators so a token leaked to one server cannot be replayed against another (RFC 8707).
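The token-binding step comes down to two additions to an ordinary authorization request: a PKCE challenge and a `resource` parameter naming the MCP server. The sketch below builds such a request URL using only the Python standard library. The endpoint, client id, and server URL in the usage lines are placeholders, and a real client would obtain the authorization endpoint through metadata discovery rather than hard-coding it.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    verifier = secrets.token_urlsafe(64)  # 86 URL-safe characters, within the 43-128 limit
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorization_url(
    authorize_endpoint: str,
    client_id: str,
    redirect_uri: str,
    mcp_server_url: str,
    challenge: str,
    state: str,
) -> str:
    """Build an authorization-code request bound to one MCP server."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "code_challenge": challenge,
        "code_challenge_method": "S256",
        "resource": mcp_server_url,  # RFC 8707: the token is meant for this server only
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    # All URLs and identifiers below are placeholders for illustration.
    print(build_authorization_url(
        "https://auth.example.com/authorize",
        "example-client",
        "http://localhost:8765/callback",
        "https://mcp.example.com/mcp",
        challenge,
        secrets.token_urlsafe(16),
    ))
```

The verifier stays with the client and is sent only in the later token request, which is what stops an intercepted authorization code from being redeemed by someone else.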

The latest revision, 2025-11-25, kept extending the enterprise surface: OpenID Connect discovery, icons for tools and resources, incremental scope consent, and an experimental tasks mechanism for tracking long-running, durable requests with polling and deferred results, a direct response to the reality that real agent actions can take minutes, not milliseconds (MCP Changelog, 2025-11-25). Operationally, teams report the same lessons repeatedly: gate every server behind explicit consent, prefer scoped tokens, do not connect untrusted servers, and filter the tool list rather than exposing dozens of tools that crowd the context and confuse selection.

[IMAGE: Adoption matrix heatmap. Rows are major vendors (Anthropic, OpenAI, Google, Microsoft, Cursor, Replit), columns are integration points (host client, reference servers, remote/OAuth, registry). Cells shaded by support level, with a legend.]

Insights Worth Remembering

The protocol's value is reuse, not efficiency. A single MCP server is no cheaper to build than a bespoke adapter; it is just portable to every host that will ever exist, which is what makes a public registry possible.

Standardization is a coordination win, and coordination wins are fragile until they cross vendor lines. MCP mattered the moment OpenAI and Google adopted a protocol authored by Anthropic, because a standard owned by one vendor is just an API.

The boundary that gives MCP its security model, servers cannot see the conversation or each other, is also the boundary attackers route around, by smuggling instructions through the one channel the host does trust: tool descriptions.

Consent belongs at the host because the host is the only component that sits between a persuadable model and an irreversible action. Any design that pushes that decision into the server or the model has misplaced trust.

The protocol layer and the orchestration layer are different problems. MCP connects an app to its tools; agent-to-agent protocols connect agents to each other. Conflating them produces architecture diagrams that never quite work.

Dated, versioned specifications with capability negotiation are why MCP could add OAuth and tasks without a flag day. The handshake that feels like boilerplate is what makes the protocol evolvable.

[IMAGE: Quote card rendered as a callout figure with the line "A standard owned by one vendor is just an API" set large, subcaptioned with the December 2025 hand-off to the Linux Foundation Agentic AI Foundation.]

Open Questions

Several questions remain genuinely unresolved rather than merely unfinished. The strongest evidence we have, from MCPSecBench and related work, shows that current defenses against tool poisoning and prompt injection are weak, but whether a protocol-level fix is even possible is open. Some researchers argue the trust model needs cryptographic attestation of tool behavior; others believe the mitigation must live in the host and the model, not the wire format. The evidence supports the claim that the problem is real and the defenses are immature. It does not yet tell us which layer should own the fix.

Scaling tool selection is an active engineering frontier. As hosts connect to dozens of servers, naively exposing every tool degrades both context budget and model accuracy. Dynamic discovery, where the host retrieves only relevant tools per request, is being explored, but there is no settled standard for it, and it reintroduces a ranking problem that itself can be gamed.

Governance is the most consequential near-term variable, and here the direction is clearer. With the December 2025 transfer of MCP to the Linux Foundation's Agentic AI Foundation, alongside Block's goose and OpenAI's AGENTS.md, the protocol moved to neutral stewardship with a governing board separate from any single vendor's roadmap. Whether neutral governance accelerates careful standardization or slows it under committee dynamics remains to be seen; either outcome is plausible.