MCP Tool Registration: Multi-Tool Stdio Servers for Claude

A Model Context Protocol server is just a process that speaks JSON-RPC over stdio and answers two questions: what tools do you offer? and run this one with these arguments. Registration is where most servers go wrong. Schemas drift from the actual handler signature, dispatch becomes a 200-line if/elif chain, and the LLM starts hallucinating arguments because the JSON Schema lied about what was required.

This piece walks through registering several tools on one stdio server, the input/output contract Claude actually consumes, and a dispatch pattern that scales past a dozen tools without turning into a tarpit.

One server, many tools — the registration shape

The MCP spec (modelcontextprotocol.io) defines tools/list and tools/call as the two RPC methods every tool-providing server must implement. tools/list returns an array of tool descriptors; tools/call takes a name plus an arguments object and returns content blocks.

Here is the minimum viable registration in Python using the official SDK:

from mcp.server import Server
from mcp.types import Tool, TextContent
import mcp.server.stdio

app = Server("toolbox")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="search_docs",
            description="Search indexed documentation by query string.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string", "minLength": 1},
                    "limit": {"type": "integer", "minimum": 1, "maximum": 50, "default": 10},
                },
                "required": ["query"],
            },
        ),
        Tool(
            name="fetch_url",
            description="Fetch the readable text content of a public URL.",
            inputSchema={
                "type": "object",
                "properties": {
                    "url": {"type": "string", "format": "uri"},
                },
                "required": ["url"],
            },
        ),
    ]

Two details matter more than they look. The description is what Claude reads when deciding whether to call the tool, so vague text ("does stuff with docs") leads to under-invocation. And inputSchema is the contract Claude enforces before calling — if you mark query required and Claude only has a URL, the call is skipped silently rather than dispatched with a missing field.

The input + output contract

tools/call returns a list of Content blocks. For most tools you want a single TextContent with structured JSON inside, because Claude parses JSON better than prose:

import json

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "search_docs":
        results = await search_index(arguments["query"], arguments.get("limit", 10))
        payload = {"results": results, "count": len(results)}
        return [TextContent(type="text", text=json.dumps(payload))]
    raise ValueError(f"unknown tool: {name}")

The output side has no schema enforcement in the spec — Claude just gets whatever string you return. Three patterns work in practice:

Plain prose — fine for read_file style tools where the content is the answer
JSON-in-text — best for structured results (search hits, query results, lists)
Multiple content blocks — useful when you want to mix narrative + structured data, e.g. one TextContent summary + one TextContent JSON payload

Pick prose over JSON only when the content has no internal structure worth preserving. A search result list always wants JSON; a rendered markdown file does not.

Dispatch: the `if/elif` smell

The naive multi-tool dispatcher grows like this:

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "search_docs":
        return await handle_search(arguments)
    elif name == "fetch_url":
        return await handle_fetch(arguments)
    elif name == "summarize":
        return await handle_summarize(arguments)
    elif name == "translate":
        return await handle_translate(arguments)
    # ... twelve more elifs ...
    raise ValueError(f"unknown tool: {name}")

By tool eight this is unreadable, and every new tool requires editing two functions (the registry and the dispatcher) that can drift out of sync. The fix is a single registry of ToolSpec records that owns both schema and handler:

from dataclasses import dataclass
from typing import Awaitable, Callable

Handler = Callable[[dict], Awaitable[list[TextContent]]]

@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    input_schema: dict
    handler: Handler

TOOLS: dict[str, ToolSpec] = {
    "search_docs": ToolSpec(
        name="search_docs",
        description="Search indexed documentation by query string.",
        input_schema={
            "type": "object",
            "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
            "required": ["query"],
        },
        handler=handle_search,
    ),
    "fetch_url": ToolSpec(
        name="fetch_url",
        description="Fetch the readable text content of a public URL.",
        input_schema={
            "type": "object",
            "properties": {"url": {"type": "string", "format": "uri"}},
            "required": ["url"],
        },
        handler=handle_fetch,
    ),
}

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(name=s.name, description=s.description, inputSchema=s.input_schema)
        for s in TOOLS.values()
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    spec = TOOLS.get(name)
    if spec is None:
        raise ValueError(f"unknown tool: {name}")
    return await spec.handler(arguments)

Adding tool number nine is now a single dict entry. The registry is the only place schema and handler can drift, and a quick unit test (for name in TOOLS: assert TOOLS[name].name == name) catches the one mistake people actually make.

Schema discipline pays off in token cost

There is a measurable reason to keep schemas tight. Claude reads every tool's full JSON Schema on every turn that exposes those tools — Anthropic's tool-use docs (docs.claude.com/en/docs/agents-and-tools/tool-use) describe this as part of the system prompt overhead. A 12-tool server with 200-token schemas costs you 2400 tokens per turn before the user even types. Trimming property descriptions, removing rarely-used optional fields, and keeping enum lists short can drop that to 800 tokens — roughly 3× cheaper per turn at scale.

Concretely: prefer "type": "string" with a tight description over {"type": "string", "enum": [...30 items...]} unless the enum is genuinely closed. Use default to make optional parameters truly optional; Claude will skip them unless the description hints they matter.

Wiring it to Claude Code

A registered MCP server reaches Claude Code through a single config entry. The claude mcp add CLI (docs.claude.com/en/docs/claude-code/mcp) handles this:

claude mcp add toolbox \
  --scope user \
  --command "uvx your-toolbox-server"

Scope matters. user makes the server available across every Claude Code session for the current OS user; project scopes it to a single repo via .claude/settings.local.json. For tools that touch the filesystem of one specific project, project scope over user scope every time — otherwise you end up with a search-docs tool that points at the wrong index from inside a different repo.

After registration, claude mcp list should show the tool count parsed from your tools/list response. If the count is zero, the most common cause is the handler raising an exception during startup (which the stdio transport swallows). Wrap the handler body in a top-level try/except that logs to a file (stderr is captured by Claude Code but not always shown), then check the log.

When a single server should split

One server with twelve loosely-related tools is harder for Claude to reason about than two servers with six tightly-related tools each. The decision boundary is: do these tools share state (a database connection, an API client, an auth token)? If yes, one server. If no, two servers — Claude gets cleaner intent signals from focused descriptions, and you get separate process isolation for crashes.

References: