This article is also available in German.

As I mentioned in my column about Spring AI at the end of 2024, the entire AI ecosystem is evolving at a rapid pace. AI-based applications, such as agent systems, must interact with their environment to obtain necessary data or execute actions. We therefore need to integrate these systems into an existing landscape of other systems.

We could implement these integrations specifically for our use case. However, from the perspective of replaceability and avoiding duplicate effort, it makes more sense to use a standard. This is exactly where the Model Context Protocol (MCP) comes into play.

Model Context Protocol

MCP was launched by Anthropic in late November 2024. The protocol primarily consists of a specification and corresponding Software Development Kits (SDKs) in various languages. Additionally, there is a repository that collects different server implementations based on MCP.

At its core, the specification defines three main elements. The Host is the actual application that can be extended with features via MCP. The host creates and manages one or more clients. It ensures that data requested by clients is only shared after explicit user consent and is responsible for overall coordination.

The Clients created by the host manage communication with exactly one server. They establish a stateful session and communicate with the server on behalf of the host.

The Server provides specific functionality that the host can use as needed. Ideally, each server focuses on exactly one task. A server can run either as a process started by the client or as a standalone service.

The specification is based on four guiding principles: servers should be extremely easy to build; servers should be highly composable; servers should not be able to read the whole conversation or "see into" other servers; and features can be added to servers and clients progressively.

In addition to providing a range of features that implement the actual functionality, the specification defines a protocol for communication between client and server, which also influences the interface between host and client.

The MCP Protocol

The protocol is based on version 2.0 of the JSON-RPC specification, which essentially defines two JSON objects, Request and Response, used for function calls and their responses.

A Request consists of four fields: jsonrpc, method, params, and id. The jsonrpc field is a String specifying the protocol version (always 2.0 in this version). The method field contains the name of the function to be called. params holds the function parameters as an Object, meaning parameters are bound by name rather than order and can be complex objects beyond primitive data types; this field can be omitted if the function doesn't require parameters. The id field associates calls with responses: the client assigns a string or integer value, and the response carries the same value. If id is omitted, the request is considered a Notification (similar to a void call in Java), which expects no response.

Like the Request, the Response contains the jsonrpc field with the value 2.0 and the id field matching the request. Additionally, a Response contains either a result or error field. The result field (type Object) indicates a successful call and contains the result. In case of an error, the error field is used instead, containing a code, a message, and optionally data.

Listing 1 shows all four defined cases as TypeScript definitions.

// Request
{
  jsonrpc: "2.0";
  id: string | number;
  method: string;
  params?: {
    [key: string]: unknown;
  };
}

// Notification
{
  jsonrpc: "2.0";
  method: string;
  params?: {
    [key: string]: unknown;
  };
}

// Successful Response
{
  jsonrpc: "2.0";
  id: string | number;
  result?: {
    [key: string]: unknown;
  }
}

// Error Response
{
  jsonrpc: "2.0";
  id: string | number;
  error?: {
    code: number;
    message: string;
    data?: unknown;
  }
}
Listing 1: JSON-RPC message types
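
The four message shapes from Listing 1 can be exercised in a few lines of TypeScript. The following sketch is purely illustrative (the helper names are my own, not part of MCP or its SDKs): it builds a request and checks whether a response belongs to it by comparing ids.

```typescript
// Build a JSON-RPC 2.0 request; omitting `id` would turn it into a notification.
function makeRequest(
  id: string | number,
  method: string,
  params?: Record<string, unknown>
) {
  return params === undefined
    ? { jsonrpc: "2.0" as const, id, method }
    : { jsonrpc: "2.0" as const, id, method, params };
}

// A response answers a request exactly when the ids match.
function answers(
  response: { id: string | number },
  request: { id: string | number }
): boolean {
  return response.id === request.id;
}
```

Because correlation happens purely via id, a client can keep several requests in flight at once and pair each incoming response with its originator.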

The protocol defines a lifecycle with three phases. In the Initialization phase, the client establishes a connection to the server by calling the initialize function, as shown in Listing 2.

{
  "jsonrpc": "2.0",
  "id": "4711",
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "roots": {
        "listChanged": true
      },
      "sampling": {}
    },
    "clientInfo": {
      "name": "SomeClient",
      "version": "1.2.3"
    }
  }
}
Listing 2: initialize Request

Here, the client communicates information about the protocol version and supported features to the server. The server responds with its supported features, as shown in Listing 3.

{
  "jsonrpc": "2.0",
  "id": "4711",
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "prompts": {
        "listChanged": true
      },
      "tools": {}
    },
    "serverInfo": {
      "name": "MyServer",
      "version": "0.8.15"
    }
  }
}
Listing 3: initialize Response

If the protocol versions don’t match, the client terminates the connection. If they match, the client sends a notification to complete this phase, as shown in Listing 4.

{
  "jsonrpc": "2.0",
  "method": "notifications/initialized"
}
Listing 4: initialize Notification

The session then enters the Operation phase, where client and server communicate based on previously negotiated features.

The session ends with the Shutdown phase, in which one of the two sides (usually the client) terminates the connection. The exact process depends on the transport mechanism used.

MCP currently provides two transport mechanisms: stdio and HTTP with SSE. With stdio, the client starts the server as a child process and uses its standard input to send requests. The server responds via standard output, with error output used for log messages. The connection terminates by closing standard input and waiting for the child process to end. If necessary, the client can send a SIGTERM followed by a SIGKILL after a waiting period.
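
The stdio framing itself is simple: one JSON-RPC message per line. The following sketch shows this in TypeScript for Node.js; the helper names are my own and not part of the SDKs.

```typescript
// One JSON-RPC message per line on the child process's stdin.
function frame(message: object): string {
  return JSON.stringify(message) + "\n";
}

// Parse newline-delimited JSON arriving on the child process's stdout.
function parseFrames(stdoutChunk: string): object[] {
  return stdoutChunk
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}

// Hypothetical wiring with a child process (server command is made up):
// import { spawn } from "node:child_process";
// const server = spawn("some-mcp-server");
// server.stdin.write(frame({ jsonrpc: "2.0", id: 1, method: "ping" }));
// server.stdout.on("data", (chunk) => parseFrames(String(chunk)));
```

Note that a real transport also has to buffer partial lines, since a stdout chunk may end mid-message.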

With HTTP with SSE, the client opens a Server-Sent Event connection to the server and receives an endpoint event containing a URI. The client then uses this URI to send requests to the server via HTTP POST, and the server responds through the established SSE connection.
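
The endpoint event arrives as a regular Server-Sent Events frame. A client can extract the POST URI from it roughly as follows; this is a simplified sketch (line-based `event:`/`data:` parsing per the SSE format, the endpoint event name per MCP), not a full SSE parser.

```typescript
// Return the URI carried by an SSE "endpoint" event, or null if the
// chunk contains a different event.
function parseEndpointEvent(sseChunk: string): string | null {
  let eventName = "";
  for (const line of sseChunk.split("\n")) {
    if (line.startsWith("event:")) {
      eventName = line.slice("event:".length).trim();
    } else if (line.startsWith("data:") && eventName === "endpoint") {
      return line.slice("data:".length).trim();
    }
  }
  return null;
}
```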

In addition to messages, lifecycle, and transport mechanisms, the protocol specifies versioning using the date of the last non-backward compatible change.

The protocol also defines three utilities that improve communication between client and server: Ping, Cancellation, and Progress.
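
A ping, for example, is an ordinary request expecting an empty result, while a cancellation is a notification that refers back to an in-flight request by its id. Sketched as TypeScript literals (the ids and the reason text are made up; method names per the 2024-11-05 revision of the spec):

```typescript
// Ping: an ordinary request that expects an empty result with the same id.
const pingRequest = { jsonrpc: "2.0", id: 42, method: "ping" };
const pingResponse = { jsonrpc: "2.0", id: 42, result: {} };

// Cancellation: a notification (no id of its own) referencing the
// request to abort via requestId.
const cancelled = {
  jsonrpc: "2.0",
  method: "notifications/cancelled",
  params: { requestId: 42, reason: "User aborted the operation." },
};
```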

Now let’s examine the core of MCP: the supported features.

The MCP Features

MCP features can be provided by both servers and clients. Let’s start with server features.

The first feature is providing parameterizable Prompts. During initialization, the server informs the client that it supports this feature. With the optional property listChanged, it can also notify the client when available prompts change. The client can retrieve a list of available prompts via a prompts/list request, as shown in Listing 5.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "prompts/list"
}
Listing 5: prompts/list

To get the content of a specific prompt, the client can use the prompts/get request, as shown in Listing 6.

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "prompts/get",
  "params": {
    "name": "write_linkedin_post",
    "arguments": {
      "content": "…"
    }
  }
}
Listing 6: prompts/get

The idea is that the host offers these prompts to users for selection through a list or specific commands. When a user selects a prompt, the host retrieves and executes it.
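
The matching prompts/get response then carries the ready-to-use messages. Its shape can be sketched as a TypeScript literal (structure per the specification; the concrete texts are invented to fit the request in Listing 6):

```typescript
// Sketch of a prompts/get response: a description plus the prompt
// rendered as a list of messages the host can hand to the LLM.
const getPromptResponse = {
  jsonrpc: "2.0",
  id: 2,
  result: {
    description: "Writes a LinkedIn post.",
    messages: [
      {
        role: "user",
        content: {
          type: "text",
          text: "Write a LinkedIn post about the following content: …",
        },
      },
    ],
  },
};
```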

Servers can also provide Tools to enable function calls. The principle is identical to prompts, with requests tools/list and tools/call and the notification notifications/tools/list_changed. Although the LLM drives tool calls, MCP recommends that the host at least inform or preferably ask the user before such calls to ensure they’re safe and won’t accidentally leak sensitive data through function parameters.
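
A tools/call request names the tool and passes its arguments by name, mirroring the prompts/get pattern. A sketch (the tool name and arguments are hypothetical and would come from a prior tools/list response):

```typescript
// Sketch of a tools/call request: tool selected by name,
// arguments bound by name as an object.
const toolCall = {
  jsonrpc: "2.0",
  id: 5,
  method: "tools/call",
  params: {
    name: "get_weather",
    arguments: { city: "Berlin" },
  },
};
```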

The last server-side feature is Resources, which allow the host to access file or website contents. Resources can be listed via resources/list and retrieved via resources/read. Beyond the notifications/resources/list_changed notification that reports added or removed resources, clients can subscribe to changes in specific resources by calling resources/subscribe, after which the server sends notifications/resources/updated notifications when changes occur.
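
Resources are addressed by URI in both calls. The following sketch reads a resource and then subscribes to changes on the same URI (the file path is made up):

```typescript
// Read a resource once…
const readRequest = {
  jsonrpc: "2.0",
  id: 6,
  method: "resources/read",
  params: { uri: "file:///project/README.md" },
};

// …then subscribe, so the server pushes
// notifications/resources/updated on changes.
const subscribeRequest = {
  jsonrpc: "2.0",
  id: 7,
  method: "resources/subscribe",
  params: { uri: "file:///project/README.md" },
};
```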

Clients can also provide features to servers. Currently, two features are supported. With the Roots feature, the client answers a roots/list request by sharing one or more file system paths as roots with its server. This allows a client to provide starting directories for file system resources: for example, an IDE acting as host might share the current project directory.
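
A roots/list response from the client might look like this (shape per the specification; the project path is made up):

```typescript
// Sketch of a roots/list response: the client exposes one
// project directory as a root, addressed by a file URI.
const rootsResponse = {
  jsonrpc: "2.0",
  id: 8,
  result: {
    roots: [
      { uri: "file:///home/alex/projects/demo", name: "demo" },
    ],
  },
};
```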

The second client-side feature is Sampling. This gives the server access to an LLM by sending the sampling/createMessage request to the client, as shown in Listing 7.

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "sampling/createMessage",
  "params": {
    "messages": [
      {
        "role": "user",
        "content": {
          "type": "text",
          "text": "Summarize the following content: …"
        }
      }
    ],
    "modelPreferences": {
      "costPriority": 0.3,
      "intelligencePriority": 0.8,
      "speedPriority": 0.5
    },
    "systemPrompt": "You are a marketing manager.",
    "maxTokens": 23
  }
}
Listing 7: sampling/createMessage

As with function calls, users should be clearly informed or asked for permission before such calls. Beyond improved security, this approach allows the server to remain independent of the specific LLM provider. To still give the server influence over model selection, the three priorities shown in Listing 7 can be communicated to the client, which then selects an appropriate available model.
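
How a client turns the three priorities into a concrete choice is up to the client; the spec only defines the fields. One plausible approach is a simple weighted sum over the client's available models, sketched here (the scoring scheme and the model attributes are my own assumption, not prescribed by MCP):

```typescript
// A model as the client might describe it internally;
// all attributes normalized to 0..1, higher is better
// (for cost, higher means cheaper).
interface ModelCandidate {
  name: string;
  cost: number;
  intelligence: number;
  speed: number;
}

interface ModelPreferences {
  costPriority: number;
  intelligencePriority: number;
  speedPriority: number;
}

// Pick the candidate with the highest weighted score.
function selectModel(
  candidates: ModelCandidate[],
  prefs: ModelPreferences
): ModelCandidate {
  const score = (m: ModelCandidate) =>
    prefs.costPriority * m.cost +
    prefs.intelligencePriority * m.intelligence +
    prefs.speedPriority * m.speed;
  return candidates.reduce((best, m) => (score(m) > score(best) ? m : best));
}
```

The point of this indirection is exactly what the article describes: the server states what it cares about, while the client stays free to map those priorities onto whatever models it actually has access to.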

The Java SDK

In collaboration with the Spring AI team, an official MCP SDK for Java is now available and used by Spring AI to offer MCP support.

The SDK is divided between client and server components, with both synchronous and asynchronous versions available. On the client side, the SDK API closely follows the specification, as shown in Listing 8.

// Use stdio transport
var transport = new StdioClientTransport(
    ServerParameters.builder("npx")
        .args("-y", "@modelcontextprotocol/server-everything", "dir")
        .build());

// Create client with capabilities
var client = McpClient.sync(transport)
    .capabilities(ClientCapabilities.builder()
        .roots(true)      // Enable roots capability
        .build())
    .build();

// Initialize session
client.initialize();

// Use client
// client.listTools();
// client.callTool(new CallToolRequest(…));
// client.listResources();
// client.readResource(new ReadResourceRequest(…));
// client.listPrompts();
// client.getPrompt(new GetPromptRequest(…));

// Disconnect from server
client.closeGracefully();
Listing 8: MCP client with the Java SDK

The server-side implementation also adheres closely to the specification, as shown in Listing 9.

// Create server with capabilities
var server = McpServer.sync(transport)
    .serverInfo("MyServer", "0.8.15")
    .capabilities(ServerCapabilities.builder()
        .prompts(true)       // Enable prompt support
        .build())
    .build();

// Prompts
var prompt = new McpServerFeatures.SyncPromptRegistration(
    new Prompt("write_linkedin_post",
               "Writes a LinkedIn post.", List.of(
        new PromptArgument("content",
                           "The content to write about.", true)
    )),
    request -> {
        // Prompt implementation
        return new GetPromptResult(description, messages);
    }
);

server.addPrompt(prompt);
// server.addTool(syncToolRegistration);
// server.addResource(syncResourceRegistration);

// Shutdown server
server.close();
Listing 9: MCP server with the Java SDK

Both the client and the server side also support the HTTP with SSE transport.

Conclusion

In this article, we’ve explored the Model Context Protocol, which simplifies the integration of various data sources and functions in LLM-based applications. MCP defines a programming language-independent protocol that enables interaction between the application (host) and integration code (server) via a generic client.

The protocol builds on JSON-RPC 2.0 for communication and defines a lifecycle, two transport mechanisms, and various implementable features.

On the server side, Resources provide access to files or other content, Prompts offer ready-made templates with parameters, and Tools enable function integration. Clients can give servers access to LLMs through Sampling and manage resource access with Roots.

The Java SDK, developed in collaboration with the Spring AI team, enables implementation of both clients and servers, making MCP integration straightforward for Java developers.