
MCP in Practice: Building Real AI Integrations

Beyond the basics

If you've read an intro to MCP, you know the pitch -- a standard protocol for connecting AI agents to tools. USB for AI. But knowing what MCP is and actually building a production MCP server are very different things. This post covers the implementation side: how to build servers that work well, handle edge cases, and don't fall over in production.

Picking your language

The two best-supported SDKs right now are Python and TypeScript. Both are official, both get updates quickly.

TypeScript is the better choice if your server is lightweight -- wrapping REST APIs, doing file operations, proxying requests. The async model maps cleanly to MCP's request-response flow, and the SDK has strong typing for tool schemas.

Python makes more sense when your server needs to do heavy lifting -- calling ML models, processing data with pandas/numpy, or integrating with Python-heavy infrastructure. The Python SDK uses asyncio under the hood, so you get the same non-blocking behavior.

Pick whichever matches your team and your server's dependencies. Don't overthink it.

Designing tool schemas

This is where most MCP servers succeed or fail. A good tool schema is the difference between an agent that uses your server effectively and one that hallucinates parameters or misunderstands what a tool does.

The pattern that works: treat each tool like a well-designed API endpoint. Give it a clear name that describes the action (query_database, not db_op). Write a description that tells the model when to use it and when not to. Define input parameters with types, descriptions, and constraints.

A few things I've learned the hard way. Keep parameter counts low -- three to five is the sweet spot. Models get confused with ten parameters. Use enums where possible instead of free-text strings. If a tool accepts a query language, specify which one in the description. "Accepts PostgreSQL-compatible SQL" is much better than "accepts a query."

Group related operations into separate tools rather than making one mega-tool with a mode parameter. A search_issues tool and a create_issue tool will get used more accurately than an issues_operation tool with an action field.
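
To make the guidance above concrete, here is a sketch of what a well-shaped tool definition might look like, written as a plain Python dict in roughly the JSON Schema shape MCP tool definitions use. The tool name, description text, and field choices are all illustrative, not taken from any real server.

```python
# A hypothetical search_issues tool definition, following the guidelines
# above: a verb-based name, a description that says when (and when not)
# to use it, few parameters, and an enum instead of a free-text string.
search_issues_tool = {
    "name": "search_issues",
    "description": (
        "Search issues in the tracker by keyword. Use this to find "
        "existing issues; use create_issue to file a new one."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Keywords to match against issue title and body.",
            },
            "state": {
                "type": "string",
                "enum": ["open", "closed", "all"],
                "default": "open",
                "description": "Which issues to include.",
            },
            "limit": {
                "type": "integer",
                "minimum": 1,
                "maximum": 100,
                "default": 20,
                "description": "Maximum number of results to return.",
            },
        },
        "required": ["query"],
    },
}
```

Three parameters, one required, one enum -- a model has very little room to hallucinate inputs here.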

Authentication and authorization

MCP doesn't prescribe a specific auth mechanism, which means you need to figure this out yourself. There are a few patterns that have emerged.

For local stdio servers -- the ones that run as subprocesses on the user's machine -- auth is often implicit. The server inherits the user's environment, so it can read tokens from environment variables, config files, or the system keychain. This is fine for personal use.

For remote HTTP+SSE servers, you need real auth. The most common pattern is OAuth 2.0 with the MCP server acting as a resource server. The client obtains a token through the usual OAuth flow and passes it in the Authorization header. The server validates it on each request.

Authorization is trickier. Just because a user is authenticated doesn't mean they should have access to every tool. The pattern I recommend: define scopes per tool, check them on every invocation. A read-only scope shouldn't let someone call a delete_record tool.
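
A minimal sketch of that per-tool scope check, assuming the token's granted scopes have already been extracted during validation. The tool names, scope strings, and the `authorize` helper are all hypothetical, not part of any MCP SDK.

```python
# Hypothetical scope map: each tool declares the scopes it requires.
TOOL_SCOPES = {
    "search_issues": {"issues:read"},
    "create_issue": {"issues:write"},
    "delete_record": {"records:admin"},
}

def authorize(tool_name: str, granted_scopes: set[str]) -> None:
    """Check a tool invocation against the caller's granted scopes."""
    required = TOOL_SCOPES.get(tool_name)
    if required is None:
        raise PermissionError(f"Unknown tool: {tool_name}")
    missing = required - granted_scopes
    if missing:
        raise PermissionError(
            f"Token lacks scopes {sorted(missing)} required by {tool_name}"
        )

# A read-only token can search...
authorize("search_issues", {"issues:read"})
```

The important property is that the check runs on every invocation, not once at connect time -- tokens can be downscoped or revoked mid-session.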

Streaming responses

Some tools take a while -- running complex queries, generating reports, processing large files. MCP supports streaming via Server-Sent Events (SSE), and you should use it for anything that takes more than a couple of seconds.

The pattern is straightforward. Instead of returning one big result at the end, yield partial results as they become available. For a database query, stream rows as they come back. For a file processing operation, stream progress updates.

The key implementation detail: structure your streaming so each chunk is independently useful. Don't stream half a JSON object. Stream complete status updates, complete rows, or complete sections.

For long-running operations, also implement a cancellation mechanism. If the user moves on, your server should be able to abort the work rather than burning resources on something nobody's waiting for.
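
Both ideas -- complete chunks and prompt cancellation -- can be sketched with a plain async generator. `fetch_rows` here is a stand-in for a real database cursor, and the chunk shapes are illustrative rather than any SDK's wire format.

```python
import asyncio

async def fetch_rows(query: str):
    # Stand-in for a real async database cursor.
    for i in range(5):
        await asyncio.sleep(0)  # yield control to the event loop
        yield {"id": i, "value": f"row-{i}"}

async def stream_query(query: str):
    try:
        async for row in fetch_rows(query):
            # One complete row per chunk -- never half a JSON object.
            yield {"type": "row", "data": row}
        yield {"type": "done"}
    except asyncio.CancelledError:
        # The client went away: stop the backing work instead of
        # burning resources, then re-raise so the runtime cleans up.
        raise

async def main():
    return [chunk async for chunk in stream_query("SELECT 1")]

chunks = asyncio.run(main())
```

Because each chunk is self-contained, a client that disconnects after three rows still received three usable rows, and the `CancelledError` path stops the cursor work immediately.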

Testing MCP servers

Testing MCP servers is awkward because the protocol is inherently interactive. Here's the approach that works.

Unit test your tool implementations directly. Don't test through the MCP protocol layer -- just call the functions that back each tool and verify they return the right data with the right structure.
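
For example, a hypothetical `list_tables` tool backed by an injected database connection can be tested as an ordinary function against an in-memory SQLite database -- no MCP client, no protocol layer:

```python
import sqlite3

def list_tables(conn) -> dict:
    # Hypothetical implementation backing a list_tables tool.
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'"
    )
    return {"tables": sorted(r[0] for r in rows)}

# Arrange: an in-memory database with two known tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER)")

# Act: call the function directly, bypassing the protocol layer.
result = list_tables(conn)
```

Injecting the connection rather than creating it inside the function is what makes this testable in the first place.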

For integration tests, use the MCP Inspector tool. It lets you connect to your server, list tools, call them with specific inputs, and inspect the responses. Think of it as Postman for MCP.

For end-to-end testing, connect your server to a real LLM client and give it tasks that require your tools. This catches the subtle stuff -- tool descriptions that confuse the model, parameter schemas that lead to malformed inputs, response formats that the model can't parse well.

Write tests for error cases specifically. What happens when the backing service is down? When the user passes invalid parameters? When a tool times out? Models handle errors better when they get clear, structured error messages rather than stack traces.

Deployment patterns

There are two main deployment models, and they serve different purposes.

Local stdio servers run as child processes on the user's machine. The host app (Claude Desktop, Cursor, your custom agent) spawns the server process and communicates over stdin/stdout. This is simple, requires no infrastructure, and works great for personal tools -- filesystem access, local databases, development utilities.

The downside: every user needs the server installed locally, and you can't share state between users.

Remote HTTP+SSE servers run on your infrastructure and serve multiple clients over the network. This is what you want for team tools, production integrations, or anything that needs centralized state.

Deploy these like any other web service -- behind a load balancer, with health checks, with proper logging. The SSE connection is long-lived, so make sure your infrastructure handles that (some load balancers have aggressive idle timeouts that kill SSE connections).

A third pattern is emerging: hybrid servers that run locally but proxy to remote backends. The local process handles the MCP protocol and auth, while the actual work happens on a remote server. This gives you the simplicity of local deployment with the power of remote compute.

What makes a good MCP server

I've built several MCP servers and used dozens more; here's what separates the good ones.

Focused scope. The best servers do one thing well. A GitHub server that handles repos, issues, and PRs. A database server that handles queries and schema inspection. Don't try to build a server that does everything.

Helpful descriptions. Models rely heavily on tool descriptions to decide what to call and how. Spend time on these. Include example inputs, edge cases, and constraints.

Predictable errors. Return structured errors with clear messages. "Table 'users' not found in database 'prod'" is useful. "Internal server error" is not.

Reasonable defaults. If your search tool can return thousands of results, default to a sensible limit. If your query tool can be destructive, make read-only the default mode.
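
The last two points fit in one small sketch: a hypothetical query tool that defaults to a capped, read-only mode and returns a structured error a model can act on. All names here are illustrative.

```python
def query_table(table: str, known_tables: set[str],
                limit: int = 50, read_only: bool = True) -> dict:
    """Illustrative tool handler with safe defaults and structured errors."""
    if table not in known_tables:
        # Structured, actionable error -- not a stack trace.
        return {
            "error": {
                "code": "TABLE_NOT_FOUND",
                "message": f"Table '{table}' not found",
                "available": sorted(known_tables),
            }
        }
    # Safe defaults: results capped at `limit`, writes require opting out
    # of read_only explicitly.
    return {"rows": [], "limit": limit, "read_only": read_only}
```

Listing the available tables in the error body gives the model enough context to retry correctly instead of guessing.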

The ecosystem trajectory

The MCP ecosystem is growing fast. A year ago, there were maybe a dozen public MCP servers. Now there are hundreds, covering databases, cloud providers, SaaS tools, development utilities, and more.

The pattern looks a lot like the early API economy -- first a rush of basic wrappers, then consolidation around well-built servers, then platforms that aggregate and manage them. We're somewhere between phase one and phase two right now.

If you're building an MCP server, the best advice is to ship something narrow and well-tested rather than something broad and brittle. The servers that get adopted are the ones that work reliably on the first try, not the ones with the longest feature list.