MCP vs Custom API Integration: When Each One Actually Wins
Every team I talk to that has shipped one Claude integration asks the same question within a month: should this tool be an MCP server, or should it stay as a tool definition inside our app? The answer gets framed as a technology debate, but it’s really a question about how many places you plan to use the same capability.
Here is the short version. For about 90% of teams, a custom API integration written directly into the code that calls Claude is the right call. The Model Context Protocol is the right call when you need the same tool surface across multiple LLM clients, when you are building a reusable internal platform, or when you are shipping tools for other people to hit from their own assistants. The rest of this guide walks through why, with the cost model and a decision tree at the end.
I run both patterns in production. My Telegram bot was a custom integration for months before I turned the task layer into a proper MCP server, once I started using Claude Code against the same tasks. I’ll use those two systems as the running examples.
Verdict up front
Pick a custom API integration when you have one app, one surface, and the tools are tightly coupled to that app’s state. You’ll ship faster, debug faster, and carry less operational weight.
Pick an MCP server when at least one of these is true:
- You want two or more LLM clients (Claude Desktop, Claude Code, Cursor, a custom agent) to use the same tools
- You are exposing capabilities to teammates or customers so they can call them from their own LLM client
- You are building a reusable internal platform where tool evolution needs to be decoupled from app deploys
- You are selling a SaaS product and want customers to call it from any MCP-aware client
Everything else is a middle ground, and the middle ground almost always starts as custom and extracts to MCP later.
What MCP actually gives you
The Model Context Protocol is a JSON-RPC-based specification for how an LLM client (host) talks to a tool provider (server). The protocol defines how the client discovers available tools, how it invokes them, how the server returns structured results, and how resources and prompts get surfaced. A server can run as a local stdio process the client launches, or as a remote SSE/HTTP endpoint.
The value isn’t the wire format. The value is that once your capability is an MCP server, any MCP-aware client can use it without you shipping a new integration. Claude Desktop, Claude Code, Cursor, and a growing list of third-party clients all speak the same protocol, so the same server answers all of them. If you want a deeper walkthrough of the protocol and how tool discovery actually works, I wrote one in MCP servers explained. For the implementation side, build an MCP server in TypeScript covers the minimal skeleton.
What custom integration gives you
A custom API integration means you define your tools directly in your Claude SDK call, implement the handlers in the same codebase, and call the Anthropic API yourself. No separate process, no protocol layer, no discovery handshake. The tool schema is a JSON schema in your app, the handler is a function in your app, and the results go back in the same request/response cycle.
```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [
  {
    name: "get_task",
    description: "Fetch a task by ID",
    input_schema: {
      type: "object",
      properties: { id: { type: "string" } },
      required: ["id"],
    },
  },
];

const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  tools,
  tool_choice: { type: "auto" },
  messages: [{ role: "user", content: "What's on task 42?" }],
});
```
That’s the whole integration. The handler for get_task is a normal function in the same codebase. If you want the full tool-use flow including multi-turn handling, I covered it in the Claude API tool use guide.
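The handler side can be as simple as a name-to-function map. A minimal sketch, assuming a `get_task` handler with a faked-out lookup standing in for your real data layer (the handler body and `dispatchToolUse` helper are illustrative, not from a real codebase):

```typescript
// Hypothetical handler map: get_task matches the tool schema above; the
// hardcoded result stands in for a real database lookup.
const handlers: Record<string, (input: any) => Promise<unknown>> = {
  get_task: async ({ id }: { id: string }) => ({ id, title: "Ship the report" }),
};

// Turn one tool_use block from the model's response into the tool_result
// block you send back on the next messages.create call.
async function dispatchToolUse(block: { id: string; name: string; input: unknown }) {
  const handler = handlers[block.name];
  if (!handler) throw new Error(`Unknown tool: ${block.name}`);
  return {
    type: "tool_result" as const,
    tool_use_id: block.id,
    content: JSON.stringify(await handler(block.input)),
  };
}
```

Everything lives in one process: no discovery handshake, no transport, and a stack trace from a failing handler points straight at your code.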
This is the pattern I use for my Telegram bot’s business logic and for the Graffiti customer profiling agent. One process, one deploy, normal logs.
Side by side
| Dimension | Custom integration | MCP server |
|---|---|---|
| Portability across LLM clients | Locked to your app | Any MCP client works |
| Engineering cost (initial) | Low, hours | Higher, 1-2 days of protocol plumbing |
| Engineering cost (long-term, reused) | Linear per new client | Amortized after ~3 integrations |
| Debuggability | Single process, normal logs | Cross-process, needs client and server logs |
| Auth and security | Inherits your app’s auth | You build a new auth layer |
| Testing | Normal test tooling | Needs protocol-aware harness or contract tests |
| Third-party ecosystem | None | Growing marketplace of prebuilt servers |
| Latency | One fewer IPC hop | Extra roundtrip per call |
| Version management | Coupled to app deploys | Decoupled, pro and con |
| Stateful tools | Natural, share app state | Awkward, needs explicit state transport |
The rows that matter most in practice are the first three. Everything else follows from them.
When MCP is worth the overhead
There is one question that decides this: how many distinct LLM clients will call these tools? If the answer is one, stay custom. If the answer is two or more now, or clearly will be within six months, start with MCP.
Concrete situations where I’d reach for MCP:
You have two or more Claude surfaces. If your team uses Claude Desktop for research, Claude Code for engineering, and a custom Slack bot for ops, and all three need the same internal tools (search your task tracker, query your analytics, post to your Notion), MCP pays for itself quickly. Each new client is a config entry, not a new integration project.
You want other people to invoke your tools through their own client. A senior engineer wants to use your internal tools from Cursor. A PM wants them from Claude Desktop. An MCP server means you don’t care which client they use. You publish the server, they add it to their config.
You are building a reusable SDK for an external platform. If you sell a SaaS and want customers to hit it from any LLM, an MCP server is the right shape. One server, many customer clients, versioned like an API.
Long-lived integration where decoupling matters. If the tools change on a different rhythm than the app, or several teams need to evolve them independently, the process boundary is useful. Deploy the server separately, version it separately, roll back separately.
When custom integration wins
Custom integration wins on every dimension that isn’t portability. If portability isn’t a requirement, the other wins compound fast.
Keep it custom when:
- You have one app and one Claude usage surface, and that’s not changing
- Tool latency matters and you want to avoid an extra IPC hop per call
- Your tools are stateful and closely coupled to app state (mid-request session data, in-memory caches, open DB transactions)
- You already have a mature RPC layer and auth story in your app, and MCP would mean rebuilding that perimeter
- You are early in the product and don’t know yet which tools are worth keeping
The last one is the most common trap. Teams reach for MCP at the “let’s do it right” phase, before they know which tools will survive the first three iterations. Most tools won’t. Build them in-app first, see which ones actually earn their keep, then promote the survivors.
The hybrid pattern
The pattern that keeps working for me: build core tools as a custom integration first, then extract to MCP once you hit the second client. The extraction is usually smaller than people expect if you structured the handlers well.
The trick is to write your custom tool handlers as pure functions from input to output from day one. Don’t bury them inside route handlers, don’t let them grab state off this. Each handler should take a typed input, return a typed output, and keep side effects in an injectable service. When you’re ready to extract, the MCP server becomes a thin adapter over those same functions.
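A minimal sketch of that factoring (the `TaskService` interface and handler names are mine, chosen for illustration):

```typescript
// Illustrative types: the service is injected, so the handler stays a pure
// mapping from typed input to typed output with side effects behind svc.
interface Task { id: string; title: string; done: boolean }
interface TaskService { getTask(id: string): Promise<Task | null> }

async function getTaskHandler(
  input: { id: string },
  svc: TaskService,
): Promise<{ found: boolean; task?: Task }> {
  const task = await svc.getTask(input.id);
  return task ? { found: true, task } : { found: false };
}

// The in-app integration calls getTaskHandler directly. An MCP server later
// wraps the same function in a thin adapter that only translates the protocol
// envelope, roughly (shape only, not a real SDK call):
//   server.tool("get_task", schema, (input) => getTaskHandler(input, svc))
```

Nothing in the handler knows whether it's being called from a `messages.create` loop or from an MCP request, which is exactly what makes the later extraction cheap.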
For the step-by-step on the TypeScript side, building an MCP server in TypeScript shows the minimal adapter layer you need once your handlers are factored this way.
Real scenarios from my stack
Scenario 1: internal task manager. I run a task system across TickTick with custom scoring, semantic search over my history, and project sync. For a long time this lived inside a Telegram bot as a custom Claude integration. One process, tools defined in the same file that calls messages.create. Fine for one surface.
When I started using Claude Code on the same tasks (morning planning, pruning stale items, generating weekly reports), I had two options: reimplement the tool definitions inside a Claude Code agent, or extract them once and share. I extracted. The TickTick MCP server now runs as a systemd service on the same VPS, and both the Telegram bot and Claude Code hit the same server. Total refactor cost was around two days, and the value paid back the first time I added a third client (a cron job that runs claude -p with the same tool surface).
If you want the agent-side of this story, Claude Code SDK agents covers how a Claude Code client consumes MCP tools.
Scenario 2: customer-facing support agent. A single product, single surface, single client. Latency matters (the user is waiting), the tools are coupled to the product’s session state, and nobody else is ever going to call these tools. This one stays custom forever. Adding MCP would buy nothing and cost a deploy, an auth layer, and an extra hop per call.
The pattern is: if the tool surface is internal and multi-client, lean MCP. If the tool surface is external-facing and single-client, lean custom.
MCP server hosting: stdio vs remote
Two deployment modes, very different operational stories.
Stdio. The client launches the server as a child process and talks over stdin/stdout. Config is a JSON entry in the client, no network, no auth layer. This is the default for local dev tools like Claude Desktop and Claude Code. It’s the simplest possible deployment: ship a binary or a Node script, reference it in the client config, done.
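For Claude Desktop, that config entry looks roughly like this (the server name and script path are placeholders):

```json
{
  "mcpServers": {
    "tasks": {
      "command": "node",
      "args": ["/path/to/task-server.js"]
    }
  }
}
```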
SSE/HTTP (remote). The server runs somewhere reachable and the client connects over HTTP with Server-Sent Events for streaming. You now need authentication (tokens, OAuth, whatever fits), TLS termination, and an operational surface for the server (logging, metrics, restart policy). This is what you want when multiple users need to hit the same server, or when the server needs resources (GPU, proprietary data) that can’t ship to every client.
If you’re building for a team, remote is usually worth it. If you’re building for yourself, stdio is fine until you aren’t the only user.
Cost modeling
Rough numbers from what I’ve shipped:
- Custom integration, per tool: 4 to 8 hours initial, then linear. A new tool is schema plus handler plus a few tests. No protocol overhead.
- MCP server, initial overhead: 1 to 2 days. Protocol plumbing, transport choice, deploy story, client config docs. This is paid once.
- MCP server, per tool: similar to custom after the initial overhead is paid. The schema shape is close enough that you can share most of it.
- Adding the Nth client: custom is a new integration project (hours to days). MCP is a config entry (minutes).
Break-even is roughly three clients, or six months of reuse with two clients. Below that, custom wins. Above that, MCP wins. This matches the general shape of the build vs buy tradeoff for AI agents, where infrastructure investment only pays off if you actually reuse it.
One caveat: “number of clients” here means distinct LLM surfaces, not distinct users. One Claude Code instance used by 50 engineers is one client. One Claude Desktop plus one custom agent is two.
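The break-even claim can be sanity-checked with a toy model. All numbers are the rough hour estimates above, and this is an illustration of the shape of the curves, not a forecast:

```typescript
// Custom: every new client means re-integrating every tool (~6h per tool).
const customHours = (clients: number, tools: number) => clients * tools * 6;

// MCP: ~12h of one-time protocol plumbing, ~6h per tool once, then each
// additional client is a config entry (~0.5h).
const mcpHours = (clients: number, tools: number) => 12 + tools * 6 + clients * 0.5;
```

With four tools, custom wins at one client (24h vs ~36h) and MCP wins clearly by the third (72h vs ~38h), which is the break-even shape described above.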
Which should you choose?
Walk this tree top to bottom and stop at the first yes.
- Do you have, or will you have within six months, two or more LLM clients calling the same tools? Yes → MCP. No → continue.
- Are you exposing these tools to people outside your codebase (teammates, customers) who will use their own LLM client? Yes → MCP. No → continue.
- Are these tools a product in themselves, sold or shared as a reusable platform? Yes → MCP. No → continue.
- Are your tools tightly coupled to in-process state, or is latency critical? Yes → custom, and don’t look back. No → continue.
- Default: custom, with handlers factored as pure functions so you can extract later.
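The same tree, written as a function (the field names are mine, mirroring the questions above):

```typescript
type Choice = "mcp" | "custom";

// First "yes" from the top wins; the final return is the default branch.
function chooseIntegration(o: {
  multiClientWithinSixMonths: boolean;
  externalCallers: boolean;
  toolsAreTheProduct: boolean;
  tightStateCouplingOrLatencyCritical: boolean;
}): Choice {
  if (o.multiClientWithinSixMonths) return "mcp";
  if (o.externalCallers) return "mcp";
  if (o.toolsAreTheProduct) return "mcp";
  if (o.tightStateCouplingOrLatencyCritical) return "custom";
  return "custom"; // default: custom, with handlers factored for later extraction
}
```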
If you end up at “custom” and later hit case 1 or 2, the extraction is real work but not dramatic. The teams that get burned are the ones who reach for MCP on day one because it sounds more professional, then spend a week on protocol and auth before shipping their first tool.
Pick based on how many surfaces will use the tools, not based on which option sounds more serious. Most teams don’t need MCP yet. The ones that do usually know it without asking.