
Claude Code Skill YAML: The Four-Field Frontmatter Contract

A field-by-field breakdown of the YAML frontmatter that turns a Markdown file into a Claude Code skill: name, description, allowed-tools, and model.



A Claude Code skill is one Markdown file. Strip away the body and what remains is a YAML frontmatter block that decides whether the agent ever loads your instructions in the first place. Get the four fields right and the skill fires when it should. Get them sloppy and the LLM either ignores the skill, runs it on the wrong prompts, or hands it a tool surface it should never have touched.

The contract has exactly four fields you will touch in practice: name, description, allowed-tools, and model. Each one answers a different question the runtime asks before invocation. Treating them as one undifferentiated YAML blob is the usual cause of "why isn't my skill triggering" tickets.

Why frontmatter beats prose-only skills

Older agent patterns expected you to dump everything into a system prompt and let the model figure it out. That approach scales badly past three or four skills. A registry of frontmatter-tagged Markdown files lets the runtime do cheap selection (read 200 tokens of frontmatter per skill) before paying for the expensive invocation (load the full body). On a project with 40 skills, a 200-token-per-skill scan costs roughly 8K tokens of context, while loading every body would cost 80K-plus and crowd out the actual conversation.

The frontmatter is the cheap part. Spend disproportionate effort on it.

The name field

name is the skill's stable identifier. It appears in slash-command form (/your-skill-name), in the skill's filesystem path, and in any cross-references. Treat it like a public API: rename it and you break every prompt, doc, and CI script that referenced the old slug.

Constraints:

  • Lowercase, kebab-case (claude-skill-yaml-frontmatter-contracts, not ClaudeSkillYAMLFrontmatterContracts)
  • Match the filename or directory name exactly. The runtime cross-checks both, and a mismatch silently makes the skill uninvokable
  • Keep it under 40 characters. Longer slugs survive but make CLI invocation painful

name: copy-editing

Avoid clever names. polish-prose reads nicely, but six months from now you will type /copy and expect autocomplete to find it. Predictable beats memorable.

The description field — where most skills die

description is the field the LLM actually reads when deciding whether to fire your skill. The runtime presents every available skill's description to the model on every turn. If your description is vague, the model picks the wrong skill or none at all.

The pattern that works: lead with the trigger condition, then list trigger keywords, then state what's out of scope.

description: |
  Use this skill when the user wants to edit, review, or improve existing
  marketing copy. Trigger on phrases like "edit this copy", "polish this",
  "tighten this up", "too wordy", "this reads awkwardly". For writing new
  copy from scratch, use copywriting instead. For SEO-focused audits, use
  seo-audit.

Three structural elements pull the LLM's selection accuracy up:

  1. Verb-first trigger sentence — "Use this skill when…" gives the model a clear conditional
  2. Keyword exemplars in quotes — the model pattern-matches user phrasing against these
  3. Negative scoping — naming sibling skills and what they cover prevents collisions

Skills with one-line descriptions (description: "edits copy") misfire roughly 3× more often than skills with multi-line scoped descriptions in informal benchmarks across a registry of around 25 skills. The token budget for the frontmatter is generous; spend it.

The allowed-tools field as a security boundary

allowed-tools declares the tool surface the skill can call. It is both a capability hint to the model and a hard fence at runtime. Omit the field and the skill inherits the parent agent's full toolset, which is usually wrong: a copy-editing skill should not be running shell commands.

allowed-tools:
  - Read
  - Edit
  - Grep
  - Glob

Three patterns to know:

| Pattern | When to use |
|---|---|
| Read-only set: Read, Grep, Glob | Audit, review, analysis skills |
| Read + Edit: above plus Edit, Write | Refactoring, writing, doc-generation skills |
| Full surface (omit field) | General-purpose skills you trust completely |

The principle of least privilege wins here just like in regular software. A skill that lints a config file does not need shell access; locking it down means a prompt-injection attack that targets the skill cannot escalate to running arbitrary commands.

For network-touching tools, be explicit:

allowed-tools:
  - Read
  - Edit
  - WebFetch
  - mcp__github__create_pull_request

MCP-prefixed tools follow the mcp__<server>__<tool> convention. Naming them individually rather than wildcarding mcp__github__* is good hygiene — wildcards drift as the upstream MCP server adds new tools you never reviewed.

The model field for cost and latency tuning

model is optional. Set it to override the parent agent's default model for this specific skill. Use it for two reasons: cost reduction on cheap operations, and latency reduction on user-facing skills.

model: haiku

A skill that lints YAML or runs a quick grep does not need Sonnet or Opus reasoning. Routing it through Haiku cuts the per-invocation cost by roughly 80% on input tokens compared to Sonnet, and shaves about 40% off response latency. For a skill that fires 50 times a day, that compounds.

The reverse case: a skill that does deep code review or multi-file refactoring should not be downgraded to Haiku just because its parent agent runs there. Override upward to Sonnet or Opus for those.

When in doubt, omit the field and inherit. Premature model-pinning is a real cause of pipeline weirdness: a skill explicitly tagged model: haiku can quietly degrade output quality six months later, after the surrounding system has been upgraded to a newer Sonnet.

A complete worked example

Putting all four fields together for a hypothetical PR-review skill:

---
name: pr-review
description: |
  Use this skill when the user asks for a review of a pull request, code
  review, or PR feedback. Trigger on "review this PR", "code review",
  "review the diff", "PR feedback", "look at this change". Pulls the diff,
  reads context, and returns inline comments. For pre-merge SEO checks
  on content PRs, use seo-audit instead.
allowed-tools:
  - Read
  - Grep
  - Glob
  - Bash(git:*)
  - mcp__github__get_pull_request
  - mcp__github__create_pull_request_review
model: sonnet
---

Read it as four answers:

  • name: invoked as /pr-review
  • description: model picks this skill when the user asks for PR review, but defers to seo-audit for content PRs
  • allowed-tools: read code, grep it, run scoped git commands (unpacked below), talk to GitHub MCP. No write to disk, no arbitrary shell.
  • model: pinned to Sonnet because review needs reasoning that Haiku struggles with on multi-file diffs
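
The Bash(git:*) entry is doing quiet work: Claude Code's permission syntax scopes shell access to a command prefix instead of granting arbitrary Bash. If review only ever needs to inspect history, the grant can be tightened further. A sketch, assuming the same prefix-matching rules as the example above:

allowed-tools:
  - Read
  - Grep
  - Glob
  - Bash(git diff:*)
  - Bash(git log:*)

With this set the skill can diff and read history but cannot git push or git reset, which shrinks the blast radius of a prompt injection still further.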

Pitfalls worth memorizing

A few failure modes appear over and over in skill repos:

  1. Description without trigger words — "Reviews pull requests" reads like documentation but gives the model nothing to match against. Add the explicit phrases users say.
  2. allowed-tools drift — adding a WebFetch call in the body without updating the frontmatter. The runtime denies the call and the skill silently falls back, producing confusing output.
  3. Filename / name mismatch — name: pr-review in prReview.md will not load. Filenames are part of the contract.
  4. Multi-line description without | block scalar — without the literal block indicator, YAML folds your line breaks into spaces and the carefully separated trigger phrases arrive as one run-on string (see the snippet after this list). Either use | for literal multi-line or keep the description to a single line.
  5. Pinning model to a deprecated identifier — model: claude-3-opus-20240229 may work today and fail silently after a deprecation window. Prefer family aliases (opus, sonnet, haiku) when the runtime supports them.
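
Pitfall 4 is easy to reproduce. Without the | indicator, YAML parses the value as a plain scalar and folds line breaks into spaces; the | form preserves them exactly as written:

# Plain scalar: both lines fold into one run-on string
description: Use this skill when the user wants to edit copy.
  Trigger on "edit this copy", "polish this".

# Literal block scalar: line breaks survive intact
description: |
  Use this skill when the user wants to edit copy.
  Trigger on "edit this copy", "polish this".

The folded form is not a parse error, which is why it slips through review: the skill loads, but the model sees one mangled line instead of a structured trigger list.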

Tightening the loop

A useful workflow once you have more than a handful of skills: write a small script that walks ~/.claude/skills/ (or wherever your skills live), parses the frontmatter, and reports any skill missing a structured description, missing allowed-tools, or pointing at an unknown model. Catching the contract violation at lint time is roughly 10× cheaper than discovering it through a misfired skill in production.
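
A minimal sketch of such a linter in Python, assuming the common one-directory-per-skill layout with a SKILL.md per skill and PyYAML installed; the path, layout, and heuristics here are assumptions to adapt to your setup:

import re
import sys
from pathlib import Path

import yaml  # pip install pyyaml

SKILLS_DIR = Path.home() / ".claude" / "skills"  # adjust to your layout
KNOWN_MODELS = {"haiku", "sonnet", "opus"}       # family aliases only

def frontmatter(path):
    # Pull the YAML block between the leading --- fences.
    match = re.match(r"^---\n(.*?)\n---", path.read_text(encoding="utf-8"), re.DOTALL)
    return yaml.safe_load(match.group(1)) if match else None

problems = []
for skill_file in sorted(SKILLS_DIR.glob("*/SKILL.md")):
    fm = frontmatter(skill_file)
    if not fm:
        problems.append(f"{skill_file}: no frontmatter block")
        continue
    if fm.get("name") != skill_file.parent.name:
        problems.append(f"{skill_file}: name does not match directory name")
    desc = fm.get("description") or ""
    if "\n" not in desc.strip() or '"' not in desc:
        problems.append(f"{skill_file}: description lacks trigger phrases or multi-line structure")
    if "allowed-tools" not in fm:
        problems.append(f"{skill_file}: allowed-tools omitted, inherits full toolset")
    model = fm.get("model")
    if model and model not in KNOWN_MODELS:
        problems.append(f"{skill_file}: unrecognized model '{model}'")

print("\n".join(problems) if problems else "all skills pass")
sys.exit(1 if problems else 0)

Run it from a pre-commit hook or CI and the contract stays enforced as the registry grows.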

The four fields look like trivial metadata. They are the difference between a skill registry that the model navigates accurately and one that turns into noise the LLM learns to ignore.
