
MCP Tool Registration: Multi-Tool Stdio Servers for Claude

Register multiple tools on a single MCP stdio server with strict input schemas, typed outputs, and a clean dispatch pattern Claude can call reliably.

A Model Context Protocol server is just a process that speaks JSON-RPC over stdio and handles two requests: "what tools do you offer?" and "run this one with these arguments." Registration is where most servers go wrong. Schemas drift from the actual handler signature, dispatch becomes a 200-line if/elif chain, and the LLM starts hallucinating arguments because the JSON Schema lied about what was required.

This piece walks through registering several tools on one stdio server, the input/output contract Claude actually consumes, and a dispatch pattern that scales past a dozen tools without turning into a tarpit.

One server, many tools: the registration shape

The MCP spec (modelcontextprotocol.io) defines tools/list and tools/call as the two RPC methods every tool-providing server must implement. tools/list returns an array of tool descriptors; tools/call takes a name plus an arguments object and returns content blocks.
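To make the two methods concrete, here is a rough sketch of the messages as plain Python dicts, using the JSON-RPC 2.0 framing that MCP builds on. The id values, the search_docs call, and the empty result payload are all made up for illustration; encode is a hypothetical helper, not part of any SDK.

```python
import json

# Illustrative JSON-RPC 2.0 messages; ids and payloads are invented for this sketch.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "search_docs", "arguments": {"query": "stdio transport"}},
}

# A tools/call result carries a list of content blocks.
call_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "content": [
            {"type": "text", "text": json.dumps({"results": [], "count": 0})}
        ]
    },
}


def encode(message: dict) -> str:
    """Serialize one message onto a single line, as newline-delimited stdio expects."""
    return json.dumps(message, separators=(",", ":"))


if __name__ == "__main__":
    for message in (list_request, call_request, call_response):
        print(encode(message))
```

In practice the SDK builds and parses these messages for you; the sketch only shows what crosses the pipe.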

Here is the minimum viable registration in Python using the official SDK:

from mcp.server import Server
from mcp.types import Tool, TextContent
import mcp.server.stdio

app = Server("toolbox")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="search_docs",
            description="Search indexed documentation by query string.",
            inputSchema={
                "type": "object",
                "properties": {
                    "query": {"type": "string", "minLength": 1},
                    "limit": {"type": "integer", "minimum": 1, "maximum": 50, "default": 10},
                },
                "required": ["query"],
            },
        ),
        Tool(
            name="fetch_url",
            description="Fetch the readable text content of a public URL.",
            inputSchema={
                "type": "object",
                "properties": {
                    "url": {"type": "string", "format": "uri"},
                },
                "required": ["url"],
            },
        ),
    ]

Two details matter more than they look. The description is what Claude reads when deciding whether to call the tool, so vague text ("does stuff with docs") leads to under-invocation. And inputSchema is the contract Claude builds its arguments against: if you mark query required and Claude only has a URL, it will tend to pick another tool or ask for the missing value rather than send a call without it. The schema guides the model but does not guarantee compliance, so the handler should still validate what it receives.
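Whatever the client does with the schema, a handler can still receive arguments that do not match it, so re-checking on the server side is cheap insurance. The sketch below is a deliberately small checker that covers only the required list and top-level type keywords; check_arguments and SEARCH_SCHEMA are hypothetical names, and a real server would more likely use a full JSON Schema validator.

```python
# Hypothetical helper: checks only "required" and top-level "type" keywords.
# It is not a full JSON Schema validator.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}

SEARCH_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer"},
    },
    "required": ["query"],
}


def check_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return human-readable problems; an empty list means the call looks valid."""
    problems = []
    for field in schema.get("required", []):
        if field not in arguments:
            problems.append(f"missing required field: {field}")
    for field, value in arguments.items():
        spec = schema.get("properties", {}).get(field)
        if spec is None:
            problems.append(f"unexpected field: {field}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        if expected is int and isinstance(value, bool):
            problems.append(f"{field} should be integer")
        elif expected is not None and not isinstance(value, expected):
            problems.append(f"{field} should be {spec['type']}")
    return problems
```

Returning a list of problems instead of raising on the first one lets the handler report every issue in a single error message, which gives the model a better chance of fixing the call on its next attempt.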

The input + output contract

tools/call returns a list of Content blocks. For most tools you want a single TextContent with structured JSON inside, because Claude can pull individual fields out of JSON more reliably than out of free-form prose:

import json

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "search_docs":
        results = await search_index(arguments["query"], arguments.get("limit", 10))
        payload = {"results": results, "count": len(results)}
        return [TextContent(type="text", text=json.dumps(payload))]
    raise ValueError(f"unknown tool: {name}")

The output side has no schema enforcement by default: unless a tool declares the optional outputSchema added in newer revisions of the spec, Claude just gets whatever string you return. Three patterns work in practice:

  1. Plain prose: fine for read_file style tools where the content is the answer
  2. JSON-in-text: best for structured results (search hits, query results, lists)
  3. Multiple content blocks: useful when you want to mix narrative + structured data, e.g. one TextContent summary + one TextContent JSON payload

Pick prose over JSON only when the content has no internal structure worth preserving. A search result list always wants JSON; a rendered markdown file does not.
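As a sketch of how patterns 1 and 3 differ in practice, the helpers below build content blocks as plain dicts in their wire shape. All three function names are hypothetical; with the SDK you would construct TextContent objects instead, as in the handler above.

```python
import json


def text_block(text: str) -> dict:
    """Wire shape of a text content block, written as a plain dict for illustration."""
    return {"type": "text", "text": text}


def search_result_blocks(results: list[dict]) -> list[dict]:
    """Pattern 3: a one-line summary followed by the full JSON payload."""
    payload = {"results": results, "count": len(results)}
    summary = f"Found {len(results)} matching document(s)."
    return [text_block(summary), text_block(json.dumps(payload))]


def file_blocks(contents: str) -> list[dict]:
    """Pattern 1: the file text is the answer, so return it unwrapped."""
    return [text_block(contents)]
```

Keeping the summary and the payload in separate blocks means the JSON block can be parsed on its own without stripping narrative text first.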

Dispatch: the if/elif smell

The naive multi-tool dispatcher grows like this:

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "search_docs":
        return await handle_search(arguments)
    elif name == "fetch_url":
        return await handle_fetch(arguments)
    elif name == "summarize":
        return await handle_summarize(arguments)
    elif name == "translate":
        return await handle_translate(arguments)
    # ... twelve more elifs ...
    raise ValueError(f"unknown tool: {name}")

By tool eight this is unreadable, and every new tool requires editing two functions (list_tools and call_tool) that can drift out of sync. The fix is a single registry of ToolSpec records that owns both schema and handler:

from dataclasses import dataclass
from typing import Awaitable, Callable

Handler = Callable[[dict], Awaitable[list[TextContent]]]

@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    input_schema: dict
    handler: Handler

TOOLS: dict[str, ToolSpec] = {
    "search_docs": ToolSpec(
        name="search_docs",
        description="Search indexed documentation by query string.",
        input_schema={
            "type": "object",
            "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
            "required": ["query"],
        },
        handler=handle_search,
    ),
    "fetch_url": ToolSpec(
        name="fetch_url",
        description="Fetch the readable text content of a public URL.",
        input_schema={
            "type": "object",
            "properties": {"url": {"type": "string", "format": "uri"}},
            "required": ["url"],
        },
        handler=handle_fetch,
    ),
}

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(name=s.name, description=s.description, inputSchema=s.input_schema)
        for s in TOOLS.values()
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    spec = TOOLS.get(name)
    if spec is None:
        raise ValueError(f"unknown tool: {name}")
    return await spec.handler(arguments)

Adding tool number nine is now a single dict entry. The registry is the only place schema and handler can drift, and a quick unit test (for name in TOOLS: assert TOOLS[name].name == name) catches the one mistake people actually make.
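The one-line assertion can be grown into a slightly broader registry check. The sketch below redefines a stand-in ToolSpec so it runs on its own; registry_problems is a hypothetical helper, and the three checks it performs are one possible choice, not an exhaustive list.

```python
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass(frozen=True)
class ToolSpec:
    # Stand-in for the ToolSpec defined above, repeated so this sketch is self-contained.
    name: str
    description: str
    input_schema: dict
    handler: Callable[[dict], Awaitable[Any]]


def registry_problems(tools: dict[str, ToolSpec]) -> list[str]:
    """Collect mismatches between registry keys, names, and schemas."""
    problems = []
    for key, spec in tools.items():
        if spec.name != key:
            problems.append(f"{key}: key does not match spec name {spec.name!r}")
        if not spec.description.strip():
            problems.append(f"{key}: empty description")
        properties = spec.input_schema.get("properties", {})
        for field in spec.input_schema.get("required", []):
            if field not in properties:
                problems.append(f"{key}: required field {field!r} has no property entry")
    return problems
```

A test suite can then assert that registry_problems(TOOLS) returns an empty list, so a bad entry fails CI with a readable message instead of surfacing as a confusing tool call at runtime.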

Schema discipline pays off in token cost

There is a measurable reason to keep schemas tight. Claude reads every tool's full JSON Schema on every turn that exposes those tools; Anthropic's tool-use docs (docs.claude.com/en/docs/agents-and-tools/tool-use) describe this as part of the system prompt overhead. A 12-tool server with 200-token schemas costs you 12 × 200 = 2400 tokens per turn before the user even types. Trimming property descriptions, removing rarely-used optional fields, and keeping enum lists short can drop that to 800 tokens, roughly 3× cheaper per turn at scale.

Concretely: prefer "type": "string" with a tight description over {"type": "string", "enum": [...30 items...]} unless the enum is genuinely closed. Keep optional parameters out of required and give them a default so the fallback is documented; Claude will usually skip them unless the description hints they matter.
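To see where a server stands, a crude estimator is often enough. The sketch below assumes roughly four characters per token, which is only a rule of thumb and not how any real tokenizer counts; estimate_tokens, per_turn_overhead, and both example descriptors are hypothetical.

```python
import json

CHARS_PER_TOKEN = 4  # Assumed rule of thumb; real tokenizers will give different counts.


def estimate_tokens(tool: dict) -> int:
    """Very rough token estimate for one tool descriptor (name, description, schema)."""
    return len(json.dumps(tool, separators=(",", ":"))) // CHARS_PER_TOKEN


def per_turn_overhead(tools: list[dict]) -> int:
    """Sum the estimates for every tool exposed on a turn."""
    return sum(estimate_tokens(tool) for tool in tools)


# Two hypothetical descriptors for the same tool: one padded, one trimmed.
verbose_tool = {
    "name": "search_docs",
    "description": "This tool can be used in order to search through all of the "
    "indexed documentation using a query string supplied by the caller.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The query string to search for."},
            "limit": {"type": "integer", "description": "The maximum number of results."},
            "lang": {"type": "string", "enum": ["en", "de", "fr", "es", "it", "pt", "ja"]},
        },
        "required": ["query"],
    },
}

tight_tool = {
    "name": "search_docs",
    "description": "Search indexed documentation by query string.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
        "required": ["query"],
    },
}

if __name__ == "__main__":
    print("verbose:", estimate_tokens(verbose_tool), "tight:", estimate_tokens(tight_tool))
```

The absolute numbers are unreliable, but the ratio between two versions of the same registry is a reasonable signal for whether a trimming pass was worth it.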

Wiring it to Claude Code

A registered MCP server reaches Claude Code through a single config entry. The claude mcp add CLI (docs.claude.com/en/docs/claude-code/mcp) handles this:

claude mcp add --scope user toolbox -- uvx your-toolbox-server

Scope matters. user makes the server available across every Claude Code session for the current OS user; project scopes it to a single repo via a .mcp.json file at the project root, which can be checked in and shared with the team. For tools that touch the filesystem of one specific project, prefer project scope over user scope every time. Otherwise you end up with a search-docs tool that points at the wrong index from inside a different repo.

After registration, claude mcp list should report the server as connected, and its tools should show up inside a session. If no tools appear, the most common cause is the server raising an exception during startup (which the stdio transport swallows). Wrap the startup code in a top-level try/except that logs to a file (stderr is captured by Claude Code but not always shown), then check the log. Never log to stdout: on a stdio server that stream carries the JSON-RPC messages.
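One way to set up that file logging is sketched below with the standard logging module. The log path, the logger name, and both helper functions are assumptions made for illustration; start stands for whatever callable launches your server.

```python
import logging
import os
import tempfile

# Hypothetical log location; pick a path that suits your deployment.
LOG_PATH = os.path.join(tempfile.gettempdir(), "toolbox-server.log")


def configure_file_logging(path: str = LOG_PATH) -> logging.Logger:
    """Send log records to a file so nothing is ever written to stdout."""
    logger = logging.getLogger("toolbox")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # Avoid attaching duplicate handlers on repeated calls.
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    logger.propagate = False  # Keep records away from any root handlers.
    return logger


def run_logged(start, logger: logging.Logger):
    """Call start(); on failure, write the traceback to the log file and re-raise."""
    try:
        return start()
    except Exception:
        logger.exception("server failed during startup")
        raise
```

Re-raising after logging keeps the process exit code non-zero, so the client still sees the server as failed while the traceback survives in the file.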

When a single server should split

One server with twelve loosely-related tools is harder for Claude to reason about than two servers with six tightly-related tools each. The decision boundary is: do these tools share state (a database connection, an API client, an auth token)? If yes, one server. If no, two servers: Claude gets cleaner intent signals from focused descriptions, and you get separate process isolation for crashes.
