Building Model Context Protocol Servers for LLM Agents

Model Context Protocol (MCP) Server for LLM Agents

This document provides a summary of key concepts and practical instructions for setting up, testing, and integrating a Model Context Protocol (MCP) server with an LLM agent.

1. Introduction to Model Context Protocol (MCP)

The Model Context Protocol (MCP), released by Anthropic in November 2024, addresses significant challenges in integrating Large Language Models (LLMs) with external tools. Historically, different frameworks and applications have used proprietary methods for declaring and interacting with tools, leading to redundant integration efforts whenever an AI capability is desired.

Key Idea: MCP standardizes how LLMs communicate with tools, allowing developers to

"define your tool server once and use it everywhere."

This significantly streamlines the process of exposing AI capabilities to various agents and applications.

2. Core Problem Solved by MCP

The primary pain point MCP alleviates is the burden of

"repeatedly creating integrations every time you want to use an AI capability."

Before MCP, if an LLM agent needed to interact with an external function (e.g., a machine learning model, a database, an API), each agent or framework would require a custom integration for that specific tool. MCP introduces a universal communication layer for tools.

3. Building an MCP Server: A Three-Phase Approach

The source walks through a practical, step-by-step guide to building an MCP server that wraps an existing machine learning API (specifically, an employee churn prediction model served via FastAPI).

Phase 1: Building the Server

This phase focuses on setting up the development environment and writing the core server logic.

  • Project Setup:
    • Initialize a new project using uv init (e.g., uv init employee).
    • Navigate into the project directory (cd employee).
    • Create and activate a virtual environment (uv venv, then the activation command it prints, e.g. source .venv/bin/activate).
  • Dependency Installation:
    • Install the necessary packages: uv add "mcp[cli]" requests. The mcp package (installed here with its CLI extra) provides the official Model Context Protocol Python SDK, including FastMCP and the mcp dev tooling used later; requests is used to call the existing churn API.
  • Server File Creation:
    • Create a Python file for the server logic (e.g., touch server.py).
  • Server Implementation (server.py):
    • Import Dependencies: Essential imports include FastMCP from mcp.server.fastmcp, json, requests, and List from typing. FastMCP serves as the

      "crux of our entire server."

    • Instantiate MCP Server: Create an instance of FastMCP, giving it a name (e.g., mcp = FastMCP(name="churn_and_burn")).
    • Define a Tool: Tools are defined using the @mcp.tool() decorator, which wraps a Python function encapsulating the logic for interacting with the external API.
    • Example Tool: predict_churn:
      @mcp.tool()
      def predict_churn(data: List[dict]):
          """
          This tool predicts whether an employee will churn or not.
      
          Arguments:
              data: employee attributes which are used for inference.
      
          Example payload:
          [{
              "years_at_company": 2,
              "employee_sat": 0.8,
              "position": "Junior Engineer",
              "salary": 70000
          }]
          Return value: Returns either 1 (churn) or 0 (no churn).
          """
          url = "http://localhost:8000/predict" # Assuming FastAPI model is running here
          headers = {"Content-Type": "application/json"}
          response = requests.post(url, data=json.dumps({"instances": data}), headers=headers)
          response.raise_for_status() # Raise an exception for HTTP errors
          return response.json()

      Decorated with @mcp.tool().

      Takes data: List[dict] as input, representing employee attributes.

      Includes a comprehensive docstring that serves as the tool's description for LLMs. This docstring specifies:

      • Purpose:

        "This tool predicts whether an employee will churn or not."

      • Arguments:

        "data: employee attributes which are used for inference."

      • Example Payload: A list containing a dictionary with fields like years_at_company, employee_sat, position, salary.
      • Return Value:

        "Returns either 1 (churn) or 0 (no churn)."

      The function body wraps the incoming data in the {"instances": ...} payload the API expects, makes a requests.post call to the external FastAPI endpoint, and returns the JSON response.

    • Run the Server: The server is started by calling mcp.run(transport="stdio") inside an if __name__ == "__main__": block (the transport name is lowercase in the Python SDK). A consolidated sketch of the full file appears below.
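
Putting those pieces together, a minimal server.py would look roughly like the sketch below. This is a consolidation of the steps above, not the source's verbatim file; the endpoint URL and payload shape are the assumptions already used in the predict_churn example.

    import json
    from typing import List

    import requests
    from mcp.server.fastmcp import FastMCP

    # Name the server; "churn_and_burn" follows the example above
    mcp = FastMCP(name="churn_and_burn")

    @mcp.tool()
    def predict_churn(data: List[dict]):
        """Predicts whether an employee will churn (1) or not (0)."""
        url = "http://localhost:8000/predict"  # assumed location of the FastAPI churn model
        headers = {"Content-Type": "application/json"}
        response = requests.post(url, data=json.dumps({"instances": data}), headers=headers)
        response.raise_for_status()  # surface HTTP errors instead of returning a bad payload
        return response.json()

    if __name__ == "__main__":
        # STDIO transport: the client spawns this process and talks over stdin/stdout
        mcp.run(transport="stdio")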

Phase 2: Testing the Server

After building, the next crucial step is to verify the server's functionality and accessibility.

  • Starting the Dev Server: Execute uv run mcp dev server.py from the project directory. This command starts the MCP Inspector, a local web interface for testing tools.
  • Accessing the MCP Inspector: The command prints a local URL for the Inspector; opening it in a browser allows interaction with the running MCP server.
  • Listing and Testing Tools: Within the Inspector, navigate to the "Tools" section and select "List tools" to see the predict_churn tool. The Inspector allows passing a JSON payload to the tool and executing it, verifying that the external API call is made and the correct prediction is returned.
  • Transport Types: The source highlights two primary transport types for MCP servers:
    • STDIO (Standard Input/Output): Ideal for

      "connecting with local files or local tools"

      and desktop applications (e.g., Cursor).
    • SSE (Server-Sent Events): More suitable for

      "client-server related"

      interactions. The example uses STDIO due to its simplicity for a local setup; a short sketch showing how the transport choice appears in code follows this list.
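
As a rough illustration of how little changes between the two, the transport is just an argument to run(). Assuming the installed SDK accepts "stdio" and "sse" as transport names (as the official Python SDK does), the same server can be launched either way; the command-line switch shown here is an illustrative convenience, not something from the source.

    import sys

    if __name__ == "__main__":
        # e.g. `uv run server.py sse` to serve over SSE; default to STDIO otherwise
        transport = sys.argv[1] if len(sys.argv) > 1 else "stdio"
        mcp.run(transport=transport)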

Phase 3: Adding it into an Agent

This phase demonstrates how to integrate the newly built MCP server into an LLM agent framework.

  • Agent Framework: The example uses the BeeAI framework, with Ollama serving the Granite 3.1 dense 8B-parameter model as the underlying LLM.
  • Agent Configuration: The agent is configured to connect to the MCP server by specifying the command used to launch it, essentially uv run server.py with the correct path to the server.py file. The server's transport is set to STDIO to match how the agent expects to communicate (a protocol-level sketch of this connection follows this list).
  • Agent Invocation: When the agent is provided with an employee sample and asked

    "Will this particular employee churn?",

    it internally recognizes the need to use the predict_churn tool.
  • Agent Output: The agent's thought process is logged:

    "The user wants to know if the employee will churn based on their attributes. I need to use the predict churn tool."

    After executing the tool, it interprets the prediction (e.g.,

    "prediction of one indicating that the employee should churn"

    ) and provides a natural language response:

    "This employee is predicted to churn."

4. Observability and Interoperability

The briefing emphasizes two important aspects:

  • Observability: To track tool calls, developers can import logging and add a single logging configuration line (a minimal sketch follows this list), which

    "will give you the ability to see every tool call in your server logs."

  • Interoperability: MCP's core benefit is its universality. The source proves this by demonstrating that the same MCP server can be used with different clients, such as Cursor. By running a simple install command and configuring Cursor's MCP settings, the agent can access and use the predict_churn tool, reaffirming the mantra:

    "Same server, MCP everywhere."

5. Key Takeaways and Benefits of MCP

  • Standardization: MCP provides a unified way for LLMs to interact with tools, eliminating the need for bespoke integrations for each framework or application.
  • Simplified Integration: Developers can

    "define your tool server once and use it everywhere,"

    significantly reducing development effort and complexity when deploying AI capabilities.
  • Tool Agnostic: The protocol itself is tool-agnostic; it standardizes the interface to tools, not the tools themselves.
  • Improved Agent Capabilities: By providing agents with a consistent way to discover and invoke tools, MCP enhances their ability to perform complex tasks requiring external data or computation.
  • Observability: Built-in logging features allow developers to monitor tool usage and agent interactions.
  • Interoperability: The same MCP server can be seamlessly integrated across various LLM agents, applications, and platforms (e.g., BeeAI, Cursor), demonstrating true

    "plug-and-play"

    functionality.

Model Context Protocol (MCP) Server for LLM Agents: FAQ

What is the Model Context Protocol (MCP) and why was it created?

The Model Context Protocol (MCP) is a standard released by Anthropic in November 2024 designed to standardize how Large Language Models (LLMs) interact with external tools. It was created to address a significant pain point in AI agent development: the need to repeatedly create integrations for tools because every framework or application tends to have its own unique way of declaring them. MCP aims to solve this by allowing developers to define a tool server once and use it universally across different AI capabilities and agents.

How does MCP simplify AI integration for LLM agents?

MCP simplifies AI integration by providing a standardized interface for LLMs to access and utilize tools. Instead of creating custom integrations for each LLM framework or application, developers can build an MCP server that exposes their tools in a consistent manner. This means that once a tool server is built using MCP, any LLM agent capable of understanding the protocol can interact with those tools, regardless of the client's specific framework, saving significant development time and effort.

What are the key components involved in building an MCP server?

Building an MCP server involves several key components and steps. First, you need to set up a Python virtual environment and install the necessary dependencies, including the mcp package (with its CLI extra) and requests. The core of the server is built using the FastMCP class from mcp.server.fastmcp, which serves as the foundation for defining and exposing tools. Tools themselves are created with a decorator (@mcp.tool()) applied to a Python function; the function defines the input arguments, carries a docstring (including an example of the expected payload) that becomes the tool's description, and implements the interaction with the underlying API or service (e.g., a requests.post call). Finally, the server needs to be run, specifying a transport type such as STDIO (Standard Input/Output) for local interaction or SSE (Server-Sent Events) for client-server communication.

How can you test an MCP server and its exposed tools?

An MCP server can be tested using the MCP Inspector. After starting the development server (e.g., using uv run mcp dev server.py), the Inspector provides a web-based interface. Developers can connect to their MCP server through this interface, list the available tools, and directly test them by providing example input payloads. The Inspector allows you to see the output from the tool, confirming that it's functioning as expected and accurately making calls to the underlying services.

What are the different transport types supported by MCP and when would you use them?

MCP supports different transport types for communication between the LLM agent and the tool server. The two primary types mentioned are:

  • STDIO (Standard Input/Output): This is typically used for local connections or when interacting with local files and tools. It's often suitable for desktop applications or scenarios where the agent and server are running on the same machine.
  • SSE (Server-Sent Events): This transport type is more suitable for client-server relationships where continuous communication or real-time updates are needed. It's generally preferred for more complex, distributed applications.

The choice of transport type depends on the deployment scenario and how the LLM agent will communicate with the MCP server.

Can an MCP server integrate with existing machine learning APIs?

Yes, an MCP server can integrate seamlessly with existing machine learning APIs. The provided example demonstrates this by building an MCP tool that makes a POST request to a pre-existing machine learning API (built with FastAPI) for predicting employee churn. The MCP tool acts as an intermediary: it takes input from the LLM agent, formats it for the existing API, and returns the API's prediction to the agent. This highlights MCP's ability to expose existing functionality as standardized tools for LLM agents.
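
For context, the briefing does not show the FastAPI service being wrapped. A hypothetical /predict endpoint compatible with the {"instances": [...]} payload used by the predict_churn tool might look like the following; the request model, field names, and placeholder scoring rule are all illustrative assumptions, not the source's actual model.

    from typing import List

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class PredictRequest(BaseModel):
        instances: List[dict]  # each dict holds one employee's attributes

    @app.post("/predict")
    def predict(request: PredictRequest):
        # Placeholder scoring rule standing in for the trained churn model
        predictions = [1 if emp.get("employee_sat", 1.0) < 0.5 else 0
                       for emp in request.instances]
        return {"predictions": predictions}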

How can an MCP server be integrated into an LLM agent?

Integrating an MCP server into an LLM agent involves configuring the agent to communicate with the server using the appropriate transport type and command. For example, using a framework like BeeAI, you would specify the command to run the MCP server (e.g., uv run server.py) and the server's file path as part of the agent's configuration. Once configured, the agent can then identify and call the tools exposed by the MCP server, passing the necessary data and receiving responses, allowing it to incorporate external functionalities into its decision-making process.

How can you achieve observability for tool calls within an MCP server?

Observability for tool calls within an MCP server can be achieved by incorporating logging. By importing the logging module and adding a specific line of code (e.g., logging.basicConfig(level=logging.INFO) or a similar configuration) to your server script, you can enable the server to log every tool call. This allows developers to monitor and track when specific tools are being used, providing valuable insights into the agent's behavior and the server's activity.
