What Is MCP and How Does It Work?
You can think of MCP as the USB-C port on a laptop. One port gives you access to multiple functions such as charging, data transfer, display output, and more, without needing separate connectors for each purpose.
In a similar way, the Model Context Protocol provides a standard, secure, real-time communication interface that allows AI systems to connect with external tools, API services, and data sources.
Unlike traditional API integrations, which require separate code, authentication flows, documentation, and ongoing maintenance for each connection, MCP provides a single unified interface. You write the integration once, and any AI model that supports MCP can use it directly. This makes tool development more consistent and scalable across different environments.
Why It Matters
Before MCP:
Every AI app (M) needed custom code to connect with every tool (N), resulting in M × N unique integrations.
There was no shared protocol across tools and models, so developers had to reinvent the wheel for each new connection.
After MCP:
You can define or expose multiple tools within a single MCP server.
Any AI app that supports MCP can use these tools directly.
Integration complexity drops to M + N, since tools and models speak a shared protocol.
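To make the M × N versus M + N comparison concrete, here is a small arithmetic sketch. The function names and the sample numbers (5 apps, 20 tools) are illustrative assumptions, not figures from any real deployment.

```python
def point_to_point_integrations(num_apps: int, num_tools: int) -> int:
    """Without a shared protocol, every app needs a connector per tool: M x N."""
    return num_apps * num_tools


def shared_protocol_integrations(num_apps: int, num_tools: int) -> int:
    """With a shared protocol, each app and each tool implements it once: M + N."""
    return num_apps + num_tools


if __name__ == "__main__":
    apps, tools = 5, 20  # illustrative numbers only
    print(point_to_point_integrations(apps, tools))   # 100
    print(shared_protocol_integrations(apps, tools))  # 25
```

The gap widens as either side grows: adding one more tool costs M new integrations in the first model and one in the second.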
Architecture
MCP follows a client-server architecture:
Client: An AI application (such as an LLM agent, RAG pipeline, or chatbot) that needs to perform external tasks.
Server: Hosts callable tools such as “query CRM,” “fetch Slack messages,” or “run SQL.” These tools are invoked by the client and return structured responses.
The client sends structured requests to the MCP server. The server performs the requested operation and returns a response that the model can understand.
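The structured requests and responses mentioned above are JSON-RPC 2.0 messages. The sketch below builds one tool call and its matching reply as plain dictionaries so the shape is visible. The helper names, the run_sql tool, and its query argument are hypothetical; in practice a library such as FastMCP builds and parses these messages for you.

```python
import json
from itertools import count

_request_ids = count(1)  # simple source of unique request ids


def make_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking the server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def make_tool_result(request: dict, text: str) -> dict:
    """Build the matching response: same id, result carried as text content."""
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    }


if __name__ == "__main__":
    request = make_tool_call("run_sql", {"query": "SELECT 1"})  # hypothetical tool
    response = make_tool_result(request, json.dumps({"rows": [[1]]}))
    print(json.dumps(request, indent=2))
    print(json.dumps(response, indent=2))
```

The shared id is what lets the client pair a response with the request that produced it.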
In this tutorial, you will see how to build a custom MCP server using FastMCP, test it locally, and then upload and deploy it on the Clarifai platform.
FastMCP is a high-level Python framework that takes care of the low-level protocol details. It lets you focus on defining useful tools and exposing them as callable actions, without having to write boilerplate code for handling the protocol.
Why Build a Custom MCP Server?
There are already many ready-to-use MCP servers available. For example, you can find MCP servers built specifically to connect with tools like GitHub, Slack, Notion, and even general-purpose REST APIs. These servers expose predefined tools that work well for common use cases.
However, not every workflow can be covered by existing servers. In many real-world scenarios, you will need to build a custom MCP server tailored to your specific environment or application logic.
You should consider building a custom server when:
You need to connect with internal or unsupported tools: If your organization relies on proprietary systems, internal APIs, or custom workflows that aren't publicly exposed, you’ll need a custom MCP server to interface with them. While MCP servers exist for many common tools, there won’t be one available for every system you want to integrate. A custom server lets you securely wrap internal endpoints and expose them through a standardized, AI-accessible interface.
You need full control over tool behavior and structure: Off-the-shelf MCP servers prioritize flexibility, but if you require custom logic, validation, response shaping, or tightly defined schemas tailored to your business rules, building your own tools gives you clean, maintainable control over both functionality and structure.
You want to manage performance or handle large workloads: Running your own MCP server lets you choose the deployment environment and allocate specific GPU, CPU, and memory resources to match your performance and scaling needs.
Now that you have seen why building a custom MCP server can be necessary, let’s walk through how to build one from scratch.
Build a Custom MCP Server with FastMCP
In this section, let’s build a custom MCP server using the FastMCP framework. This MCP server comes with three tools designed for blog-writing tasks:
Run a real-time search to find top blogs on a given topic
Extract content from URLs
Perform keyword research with autocomplete and trends data
Let’s first build this locally, test it, and then deploy it to the Clarifai platform where it can run securely, scale automatically, and serve any MCP-compatible AI agent.
What Tools Will This MCP Server Expose?
This server offers three tools (functions the LLM can invoke):
multi_engine_search: Queries a search engine (like Google) using SerpAPI and returns the top 5 article URLs.
extract_web_content_from_links: Uses newspaper3k to extract readable content from a list of URLs.
keyword_research: Performs lightweight SEO analysis using SerpAPI’s autocomplete and trends features.
Step 1: Install Dependencies
Install the required Python packages.
Also, set your Clarifai Personal Access Token (PAT) as an environment variable:
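A quick way to confirm that both steps took effect is a small preflight check like the sketch below. It makes a few assumptions that are not stated above: that the token lives in a variable named CLARIFAI_PAT, and that newspaper3k and google-search-results are imported as newspaper and serpapi. The function names are hypothetical.

```python
import importlib.util
import os

# Import names for the packages used in this tutorial (assumed mapping:
# newspaper3k -> "newspaper", google-search-results -> "serpapi").
REQUIRED_MODULES = [
    "clarifai", "fastmcp", "mcp", "anyio",
    "requests", "lxml", "newspaper", "serpapi",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pat_is_set(env=None, var_name="CLARIFAI_PAT"):
    """Return True if the token variable exists and is not blank."""
    env = os.environ if env is None else env
    return bool(env.get(var_name, "").strip())


def preflight_report():
    """Collect human-readable problems; an empty list means the setup looks ready."""
    problems = [f"missing package for module '{m}'" for m in missing_modules(REQUIRED_MODULES)]
    if not pat_is_set():
        problems.append("CLARIFAI_PAT is not set")
    return problems
```

Calling preflight_report() before running anything else gives one list of everything still missing, which is easier to act on than the first ImportError.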
Step 2: Project Structure
To create a valid Clarifai MCP server project, your directory should follow this structure:
your_model_directory/
├── 1/
│ └── model.py
├── requirements.txt
└── config.yaml
Let’s break that down:
1/model.py: Your core MCP logic goes here. You define and register your tools using FastMCP.
requirements.txt: Lists Python packages needed by the server during deployment.
config.yaml: Contains metadata and configuration settings needed for uploading the model to Clarifai.
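If you prefer to create the layout by hand, a few lines of Python are enough. This is a minimal sketch that only creates empty placeholder files; scaffold_project is a hypothetical helper, not part of any Clarifai tooling.

```python
from pathlib import Path


def scaffold_project(root) -> list:
    """Create the three files of the layout above as empty placeholders."""
    root = Path(root)
    files = [
        root / "1" / "model.py",      # MCP tools and the Clarifai model class
        root / "requirements.txt",    # Python dependencies
        root / "config.yaml",         # build and deployment settings
    ]
    for path in files:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # existing files keep their content
    return files
```

For example, scaffold_project("your_model_directory") creates the folder in the current working directory and returns the three paths.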
You can also generate this template using the Clarifai CLI:
Step 3: Implement model.py
Here is the complete MCP server logic:
Understanding the Components
Let’s break down each component of the above model.py file.
a. Initialize the FastMCP Server
The server is initialized using the FastMCP class. This instance acts as the central hub that registers all tools and serves requests. The name you assign to the server helps distinguish it during debugging or deployment.
Optionally, you can also pass parameters like instructions, which describes what the server does, or stateless_http, which allows the server to operate over stateless HTTP for simpler, lightweight deployments.
b. Define Tools Using Decorators
The power of an MCP server comes from the tools it exposes. Each tool is defined as a regular Python function and registered using the @server.tool(...) decorator. This decorator marks the function as callable by LLMs through the MCP interface.
Each tool includes:
A unique name (used as the tool ID)
A short description that helps models understand when to invoke the tool
Clearly typed and described input parameters using Python type annotations and pydantic.Field
This example includes three tools:
multi_engine_search: Uses SerpAPI to search for articles or blogs. It accepts a query and options like search engine, location, and device type. Returns a list of top URLs.
extract_web_content_from_links: Takes a list of URLs and uses the newspaper3k library to extract the main content from each page. Returns the extracted text (truncated for brevity).
keyword_research: Combines autocomplete and trends APIs to suggest relevant keywords and rank them by popularity. Useful for SEO-focused content planning.
These tools can work independently or be chained together to create agent workflows like finding article sources, extracting content, and identifying SEO keywords.
c. Define Clarifai’s Model Class
The custom-named model class serves as the integration point between your MCP server and the Clarifai platform.
You must define it by subclassing Clarifai’s MCPModelClass and implementing the get_server() method. This method returns the FastMCP server instance (such as server) that Clarifai should use when running your model.
When Clarifai runs the model, it calls get_server() to load your MCP server and expose its defined tools and capabilities to LLMs or other agents.
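The three pieces described in a, b, and c fit together roughly as in the skeleton below. It is a minimal sketch, not the tutorial’s full server: the server name, the preview_text tool, the truncate_text helper, and the BlogToolsModel class are hypothetical, and the import path clarifai.runners.models.mcp_class is an assumption about the Clarifai SDK. The third-party imports are deferred into functions only so the pure helper can be tried without the packages installed; in a real model.py they would normally sit at module level.

```python
from typing import Annotated


def truncate_text(text: str, max_chars: int = 500) -> str:
    """Pure helper: shorten long text so tool responses stay small."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "..."


def build_server():
    """a + b: create the FastMCP server and register one tool on it."""
    from fastmcp import FastMCP
    from pydantic import Field

    server = FastMCP(
        "blog-tools-sketch",  # hypothetical server name
        instructions="Tools that help an agent research and write blog posts.",
    )

    @server.tool("preview_text", description="Return a shortened preview of a long text.")
    def preview_text(
        text: Annotated[str, Field(description="Text to shorten")],
        max_chars: Annotated[int, Field(description="Maximum characters to keep")] = 500,
    ) -> str:
        return truncate_text(text, max_chars)

    return server


def build_model_class():
    """c: wrap the server in the class that Clarifai loads."""
    from clarifai.runners.models.mcp_class import MCPModelClass  # assumed import path

    server = build_server()

    class BlogToolsModel(MCPModelClass):
        def get_server(self):
            # Clarifai calls this to obtain the FastMCP instance to serve.
            return server

    return BlogToolsModel
```

Keeping the tool body a thin wrapper around a plain function such as truncate_text makes the logic easy to unit test without starting a server.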
Step 4: Define config.yaml and requirements.txt
To deploy your custom MCP server on the Clarifai platform, you need two key configuration files: config.yaml and requirements.txt. Together, they define how your server is built, what dependencies it needs, and how it runs on Clarifai’s infrastructure.
The config.yaml file is used to configure the build and deployment settings for a custom model (or, in this case, an MCP server) on the Clarifai platform. It tells Clarifai how to build your model’s environment and where to place it within your account.
Understanding the config.yaml File
build_info
This section specifies the Python version that Clarifai should use to build the environment for your MCP server. It ensures compatibility with your dependencies. Clarifai currently supports Python 3.11 and 3.12 (with 3.12 being the default). Choosing the right version helps avoid issues with libraries like pydantic v2, fastmcp, or newspaper3k.
inference_compute_info
This defines the compute resources allocated when your MCP server is running inference, in other words, when it’s live and responding to agent requests.
cpu_limit: 1 means the model gets one CPU core for its execution.
cpu_memory: 1Gi allocates 1 gigabyte of RAM.
num_accelerators: 0 specifies that no GPUs or other accelerators are needed.
This setup is usually enough for lightweight servers that just make API calls, run data parsing, or call Python tools. If you’re deploying heavier models (like LLMs or vision models), you can configure GPU-backed or high-performance compute using Clarifai’s Compute Orchestration.
model
This section registers your MCP server within the Clarifai platform.
app_id groups your server under a specific Clarifai app. Apps act like logical containers for models, datasets, and workflows.
id is your model’s unique identifier. This is how Clarifai refers to your MCP server in the UI and API.
model_type_id must be set to mcp, which tells the platform this is a Model Context Protocol server.
user_id is your Clarifai username, used to associate the model with your account.
Every MCP model must live inside an app. An app acts as a self-contained project for storing and managing data, annotations, models, concepts, datasets, workflows, searches, modules, and more.
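The three sections can be pictured as one nested mapping. The sketch below mirrors that shape as a Python dictionary and checks the rules described above before you upload. It is an illustration under assumptions: the python_version key name is a guess at how build_info stores the version, the placeholder IDs are not real, and config_problems is a hypothetical helper rather than a Clarifai function.

```python
SUPPORTED_PYTHON = ("3.11", "3.12")
REQUIRED_MODEL_KEYS = ("app_id", "id", "model_type_id", "user_id")

# Same shape as the config.yaml sections described above (placeholder values).
EXAMPLE_CONFIG = {
    "build_info": {"python_version": "3.12"},  # key name is an assumption
    "inference_compute_info": {
        "cpu_limit": 1,
        "cpu_memory": "1Gi",
        "num_accelerators": 0,
    },
    "model": {
        "app_id": "YOUR_APP_ID",
        "id": "YOUR_MODEL_ID",
        "model_type_id": "mcp",
        "user_id": "YOUR_USER_ID",
    },
}


def config_problems(config: dict) -> list:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []
    model = config.get("model") or {}
    for key in REQUIRED_MODEL_KEYS:
        if not model.get(key):
            problems.append(f"model.{key} is missing")
    if model.get("model_type_id") and model["model_type_id"] != "mcp":
        problems.append("model.model_type_id must be 'mcp' for an MCP server")
    version = (config.get("build_info") or {}).get("python_version")
    if version is not None and str(version) not in SUPPORTED_PYTHON:
        problems.append(f"build_info.python_version {version!r} is not 3.11 or 3.12")
    return problems
```

A missing python_version is not flagged because 3.12 is described as the default.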
requirements.txt: Define Dependencies
The requirements.txt file lists all the Python packages your MCP server depends on. Clarifai uses this file during deployment to automatically install the necessary libraries, ensuring your server runs reliably in the specified environment.
Here’s the requirements.txt for the custom MCP server we’re building:
This setup includes:
clarifai, mcp, and fastmcp for MCP compatibility and deployment
anyio and requests for networking and async support
lxml and newspaper3k for content extraction and HTML parsing
google-search-results for integrating with SerpAPI
Make sure this file is placed in the root directory alongside config.yaml. Clarifai will automatically install these dependencies during deployment, ensuring your MCP server is production-ready.
Test the MCP Server
Step 5: Test the MCP Server Locally
Before deploying to production, always test your MCP server locally to make sure your tools work as expected.
Option 1: Use Local Runners
Think of local runners as “ngrok for AI models.” They let you simulate your deployment environment, route real API calls to your machine, and debug in real time, all without pushing to the cloud.
To start:
clarifai model local-runner
This will:
Spin up your MCP server locally
Simulate real-world requests to your tools
Let you validate outputs and catch errors early
Check out the Local Runner guide to learn how to configure the environment and run your models locally.
Option 2: Run Automated Unit Tests with test-locally
For a faster feedback loop during development, you can write test cases directly in your model.py by implementing a test() method in your model class. This lets you validate logic without spinning up a live server.
Run it using:
clarifai model test-locally --mode container
This command:
Launches a local container
Automatically calls the test() method you’ve defined
Runs assertions and logs results in your terminal
See the full test-locally guide to properly set up your environment and run local tests.
Upload and Deploy MCP Server
Once you have configured your model.py, config.yaml, and requirements.txt, the final step is to upload and deploy your MCP server so that it can serve requests from agents in real time.
Step 6: Upload the Model
From the root directory of your project, run the following command:
clarifai model upload
This command uploads your MCP server to the platform, using the configuration you specified in your config.yaml. Once the upload is successful, the CLI will return the public MCP endpoint:
https://api.clarifai.com/v2/ext/mcp/v1/users/YOUR_USER_ID/apps/YOUR_APP_ID/models/YOUR_MODEL_ID
This URL is the inference endpoint that agents will call when invoking tools from your server. It's what connects your code to real-world use.
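Because the endpoint follows a fixed pattern, you can also assemble it from the same IDs you put in config.yaml. The helper below is a small sketch of that; mcp_endpoint is a hypothetical name, and the IDs in the usage note are made up.

```python
from urllib.parse import quote

MCP_BASE_URL = "https://api.clarifai.com/v2/ext/mcp/v1"


def mcp_endpoint(user_id: str, app_id: str, model_id: str) -> str:
    """Assemble the endpoint URL, percent-encoding each ID as a path segment."""
    user, app, model = (quote(part, safe="") for part in (user_id, app_id, model_id))
    return f"{MCP_BASE_URL}/users/{user}/apps/{app}/models/{model}"
```

For example, mcp_endpoint("alice", "blog-app", "blog-tools") yields a URL ending in /users/alice/apps/blog-app/models/blog-tools.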
Step 7: Deploy on Compute
Uploading your server registers it to the Clarifai app you defined in the config.yaml file. To make it accessible and ready to serve requests, you need to deploy it to dedicated compute.
Clarifai’s Compute Orchestration lets you create and manage your own compute resources. It brings the flexibility of serverless autoscaling to any environment, whether you're running on cloud, hybrid, or on-prem hardware. It dynamically scales resources to meet workload demands while giving you full control over how and where your models run.
To deploy your MCP server, you’ll first need to:
Create a compute cluster: a logical group to organize your infrastructure.
Create a node pool: a set of machines with your chosen instance type.
Select an instance type: since MCP servers are typically lightweight, a basic CPU instance is sufficient.
Deploy the MCP server: once your compute is ready, you can deploy your model to the chosen cluster and node pool.
This process ensures that your MCP server is always on, scalable, and able to handle real-time requests with low latency.
Follow Clarifai’s deployment guide or tutorial to learn how to create your own dedicated compute environment and deploy your model to the platform.
Interact With Your MCP Server
Once your MCP server is deployed, you can interact with it using a FastMCP client. This lets you list the tools you've registered and invoke them programmatically using your server’s endpoint.
Here’s how the client works:
1. Client Setup
You’ll use the fastmcp.Client class to connect to your deployed MCP server. This handles tool listing and invocation over HTTP.
2. Transport Layer
The client uses StreamableHttpTransport to communicate with the server. This transport is well suited to most deployments and enables smooth interaction between your app and the server.
3. Authentication
All requests are authenticated using your Clarifai Personal Access Token (PAT), which is passed as a bearer token in the request header.
4. Tool Execution Flow
In the example client, three tools from the MCP server are invoked:
multi_engine_search: Takes a query and returns top blog/article links using SerpAPI.
extract_web_content_from_links: Downloads and parses article content from given URLs using newspaper3k.
keyword_research: Performs keyword research using autocomplete and trends data to return high-potential keywords.
Each tool is invoked via client.call_tool(...), and results are parsed using Python’s json module to display readable output.
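The four steps above can be sketched as follows. This is an outline under assumptions, not the tutorial’s own client: the helper names are hypothetical, the import path fastmcp.client.transports and the CLARIFAI_PAT variable name are assumptions, and the exact shape of the value returned by call_tool depends on the FastMCP version, so the sketch returns it unchanged. The FastMCP imports are deferred into the coroutine so the two pure helpers work without the package installed.

```python
import json


def auth_headers(pat: str) -> dict:
    """Step 3: pass the PAT as a bearer token in the request header."""
    return {"Authorization": f"Bearer {pat}"}


def parse_tool_text(text: str):
    """Decode a tool's text payload as JSON, falling back to the raw string."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


async def call_remote_tool(url: str, pat: str, tool_name: str, arguments: dict):
    """Steps 1, 2 and 4: connect, list the tools, then invoke one of them."""
    from fastmcp import Client
    from fastmcp.client.transports import StreamableHttpTransport  # assumed path

    transport = StreamableHttpTransport(url=url, headers=auth_headers(pat))
    async with Client(transport) as client:
        tools = await client.list_tools()
        print("Available tools:", [tool.name for tool in tools])
        return await client.call_tool(tool_name, arguments)


# Usage (requires a deployed server and a valid token):
#   import asyncio, os
#   result = asyncio.run(call_remote_tool(
#       "https://api.clarifai.com/v2/ext/mcp/v1/users/YOUR_USER_ID/apps/YOUR_APP_ID/models/YOUR_MODEL_ID",
#       os.environ["CLARIFAI_PAT"],
#       "multi_engine_search",
#       {"query": "model context protocol"},
#   ))
```

Falling back to the raw string in parse_tool_text keeps the client usable when a tool returns plain text instead of JSON.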
Now that your custom MCP server is live, you can integrate it into your AI agents. The agents can use these tools to complete tasks more effectively. For example, they can use real-time search, content extraction, and keyword analysis to write better blogs or create more relevant content.
Conclusion
In this tutorial, we built a custom MCP server using FastMCP and deployed it to dedicated compute on Clarifai. We explored what MCP is, why building a custom server matters, how to define tools, configure the deployment, and test it locally before uploading.
Clarifai takes care of the deployment environment, including provisioning, scaling, and versioning, so you can focus entirely on building tools that LLMs and agents can call securely and reliably.
You can use the same process to deploy your own custom models, open source models, or models from Hugging Face or other providers. Clarifai’s Compute Orchestration supports all of these. Check out the docs or tutorials to get started.