The Claude Code Skill YAML Contract: A Field-by-Field Guide

A Claude Code skill is a markdown file with a YAML frontmatter block on top. The YAML is not decoration; it is a contract between your skill and the agent runtime. Get a field wrong and the skill either fails to load, never gets discovered, or gets invoked at the wrong moment.

This guide walks every field in the contract, explains the difference between discovery and invocation (the single most-misunderstood mechanic), and lays out the version semantics that nobody talks about until something breaks in production.

The shape of a skill file

Every skill file has the same shape:

---
name: my-skill
description: When to invoke this skill...
---

# My Skill

Body of the skill: the prompt the agent reads when invoked.

That's it. The frontmatter is YAML between two --- fences. The body is markdown that the agent loads into its context as instructions when the skill fires. The runtime parses the frontmatter, indexes the skill, and waits for a trigger condition.

The minimum valid skill has exactly two fields: name and description. Everything else is optional but consequential.
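The shape above can be checked mechanically. What follows is a minimal sketch, not the runtime's loader: it assumes flat `key: value` frontmatter lines and uses only the standard library, whereas a real loader would use a full YAML parser. The names `split_skill_file` and `missing_required` are hypothetical helpers invented for this illustration.

```python
REQUIRED_FIELDS = ("name", "description")


def split_skill_file(text):
    """Split a skill file into (frontmatter dict, body string).

    Assumption: frontmatter holds only flat `key: value` lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill file must start with a --- fence")
    try:
        # Locate the closing fence, searching from the second line onward.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("frontmatter is missing its closing --- fence") from None

    frontmatter = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key: value line: {line!r}")
        frontmatter[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return frontmatter, body


def missing_required(frontmatter):
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not frontmatter.get(field)]


example = """---
name: my-skill
description: When to invoke this skill...
---

# My Skill

Body of the skill.
"""

meta, body = split_skill_file(example)
print(meta["name"], missing_required(meta))
```

Running a check like this in CI catches a missing field before the skill ever reaches an agent.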

`name`: the invocation key

name: article-writer

name is the slug the user types after the slash: /article-writer. It is also what other agents pass to the Skill tool.

Rules:

kebab-case, lowercase, no spaces
unique per loaded skill set; on a collision, one skill wins silently
prefix with a plugin namespace if the skill ships inside a plugin (plugin-name:skill-name)

The name is not user-facing display text. Don't put marketing copy here. seo-audit is right; 🚀 SEO Auditor Pro is wrong and won't load.
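The naming rules are simple enough to lint. This is a sketch under assumptions: the exact character set the loader accepts is not stated here, so the pattern below (lowercase letters and digits joined by single hyphens, with an optional `plugin-name:` prefix) is one reasonable reading of the rules, and `is_valid_name` and `find_collisions` are hypothetical helper names.

```python
import re
from collections import Counter

# One kebab-case segment: lowercase letters/digits joined by single hyphens.
_SEGMENT = r"[a-z0-9]+(?:-[a-z0-9]+)*"
# Optional plugin namespace prefix, as in plugin-name:skill-name.
NAME_RE = re.compile(rf"(?:{_SEGMENT}:)?{_SEGMENT}")


def is_valid_name(name):
    """True if the name follows the kebab-case rules sketched above."""
    return NAME_RE.fullmatch(name) is not None


def find_collisions(names):
    """Return names that appear more than once in a loaded skill set."""
    return sorted(n for n, count in Counter(names).items() if count > 1)


print(is_valid_name("seo-audit"))           # kebab-case slug
print(is_valid_name("🚀 SEO Auditor Pro"))  # display text, not a slug
print(find_collisions(["seo-audit", "csv-summarizer", "seo-audit"]))
```

Since a collision resolves silently, an explicit duplicate check is the only place you will see it reported.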

`description`: the discovery contract

description: When the user wants to audit, review, or diagnose SEO issues on a staging subdomain. Also trigger on "SEO audit", "technical SEO", "why am I not ranking"...

This is the most important field in the file, and the one most people get wrong.

The description is not a summary of what the skill does. It is a prompt fragment the orchestrating agent reads when deciding whether to fire the skill. Write it in the form: when this kind of request appears, fire me.

A description like "Audits SEO" gets ignored 80% of the time because the orchestrator can't pattern-match user intent against a verb. A description that lists trigger phrases ("when the user says 'audit my SEO'", "also trigger on 'why am I not ranking'") fires reliably because it gives the orchestrator concrete strings to anchor on.

The Anthropic skill docs at https://docs.claude.com/en/docs/claude-code recommend the discovery-oriented framing for this exact reason: the description is read at every turn, and it competes with every other skill's description for routing weight.

A practical pattern that works:

description: When the user wants to do X (the primary trigger). Also trigger on "phrase one", "phrase two", "phrase three". Use this whenever Y. For Z, see other-skill instead.

Three trigger phrases, one positive trigger, one negative-redirect to a sibling skill. That structure routes correctly across hundreds of turns.
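That structure can be turned into a rough lint. The checks below are heuristics invented for this illustration, not rules the runtime applies: they look for a trigger-style opening, at least three quoted phrases, and a "see other-skill" redirect. `lint_description` is a hypothetical helper.

```python
import re


def lint_description(description, min_phrases=3):
    """Return a list of warnings; an empty list means the heuristics passed."""
    warnings = []
    # Heuristic 1: the description opens with a trigger condition.
    if not description.lower().startswith(("when ", "use ")):
        warnings.append("does not open with a trigger condition")
    # Heuristic 2: enough double-quoted trigger phrases to anchor on.
    phrases = re.findall(r'"([^"]+)"', description)
    if len(phrases) < min_phrases:
        warnings.append(f"only {len(phrases)} quoted trigger phrase(s)")
    # Heuristic 3: a redirect of the form "see other-skill".
    if not re.search(r"\bsee\s+[a-z0-9:-]+", description):
        warnings.append("no redirect to a sibling skill")
    return warnings


good = (
    'When the user wants to do X. Also trigger on "phrase one", '
    '"phrase two", "phrase three". For Z, see other-skill instead.'
)
print(lint_description(good))
print(lint_description("Audits SEO"))
```

A clean lint does not prove the description will route well; it only flags the summary-style descriptions this section warns about.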

Discovery vs invocation: the mental model

Here is the distinction nobody explains clearly: discovery and invocation are two separate phases.

Discovery happens every turn. The orchestrator reads every loaded skill's name + description and decides whether one of them matches the current user message. This is a cheap, in-context decision.
Invocation happens after discovery picks a winner. The runtime loads the skill body (the markdown after the frontmatter) into the agent's context as instructions and runs the agent against it.

The implication: your description determines if the skill gets picked. Your body determines what happens after it gets picked.

Most skill failures are discovery failures, not invocation failures. The body is fine; the agent just never reads it because the description didn't win the routing decision. When you debug a skill that "isn't firing," start with the description, not the body.
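A toy model makes the two phases concrete. This is an illustration only: in practice the routing decision is made by the model reading descriptions, not by substring matching, and `SkillEntry`, `discover`, and `invoke` are hypothetical names. The point it shows is structural: discovery never touches the body, and the body is read only after a skill is picked.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SkillEntry:
    name: str
    description: str
    load_body: Callable[[], str]  # deferred: called only at invocation


def discover(message, skills) -> Optional[SkillEntry]:
    """Phase 1: inspect name + description only (stand-in for model routing)."""
    text = message.lower()
    for skill in skills:
        # Toy rule: odd-indexed pieces of a split on '"' are the quoted phrases.
        phrases = [p.lower() for p in skill.description.split('"')[1::2]]
        if any(phrase in text for phrase in phrases):
            return skill
    return None


def invoke(skill):
    """Phase 2: only now is the body loaded."""
    return skill.load_body()


body_loads = []


def load_csv_body():
    body_loads.append("csv-summarizer")
    return "You summarize CSV files in 5 lines or fewer."


skills = [
    SkillEntry(
        "csv-summarizer",
        'When the user wants a CSV summary. Also trigger on "summarize this CSV".',
        load_csv_body,
    )
]

picked = discover("Can you summarize this CSV for me?", skills)
print(picked.name if picked else None, body_loads)  # body not loaded yet
if picked:
    invoke(picked)
print(body_loads)
```

If `discover` returns nothing, no amount of work on the body changes the outcome, which is why triage starts with the description.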

Optional fields you'll actually use

Beyond name + description, there are a handful of fields that show up in production skills.

`model`

model: sonnet

Forces the skill to invoke a specific model regardless of the parent agent's model. Values match the model family aliases (opus, sonnet, haiku). Use this when a skill needs a specific capability tier, e.g. an analytical review skill pinned to Opus, or a fast-classification skill pinned to Haiku.

If you omit model, the skill inherits from the calling context. 90% of skills should omit it.

`version`

version: 2

This is where the field semantics get subtle. version is an integer or semver string the skill author increments when the skill's contract changes: input shape, expected outputs, side effects. It is NOT a changelog version for typo fixes.

The runtime doesn't enforce version semantics. It is a contract for consumers of the skill (other agents calling it via the Skill tool, or pipelines parsing its output). When version: 1 → 2, downstream consumers should re-read the skill body before calling it, because the contract may have shifted.

Practical rule: bump version only when an existing caller would break. Adding a new optional input? Same version. Renaming a required input? New version. Changing the JSON shape of structured output? New version.
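The practical rule can be written down as a small predicate. This is a sketch under assumptions: the contract is modeled as a dict with hypothetical keys `required`, `optional`, and `output_keys`, which is a convenience for this example and not something the runtime tracks.

```python
def needs_version_bump(old, new):
    """True if a caller written against `old` could break against `new`.

    Each contract is a dict with hypothetical keys:
      "required"    - set of required input names
      "optional"    - set of optional input names
      "output_keys" - set of field names in the structured output
    """
    # Renaming, adding, or removing a required input breaks existing callers.
    if old["required"] != new["required"]:
        return True
    # Any change to the output shape counts as a contract change.
    if old["output_keys"] != new["output_keys"]:
        return True
    # Optional inputs are deliberately ignored: adding one is non-breaking.
    return False


v1 = {"required": {"PATH"}, "optional": set(), "output_keys": {"rows", "columns"}}

with_optional = {**v1, "optional": {"DELIMITER"}}
renamed_input = {**v1, "required": {"FILE"}}
new_output = {**v1, "output_keys": {"rows", "columns", "dtypes"}}

print(needs_version_bump(v1, with_optional))
print(needs_version_bump(v1, renamed_input))
print(needs_version_bump(v1, new_output))
```

Treating every output-shape change as a bump is the conservative reading of the rule; a team that only ever adds output fields might choose to relax it.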

`allowed-tools` (where supported)

Some skill loaders accept an allowed-tools field that restricts which tools the skill body can call:

allowed-tools: Read, Grep, Glob

This is a defense-in-depth signal: the skill says "I only need read access" and the runtime enforces it. Use it for skills that should never write files or shell out, especially if the skill is going to be loaded by less-trusted callers.
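To make the restriction concrete, here is a minimal sketch of how a loader might interpret the field. It assumes a plain comma-separated list of tool names, as in the example above; loaders that accept richer patterns would need more than a split on commas. `parse_allowed_tools` and `is_tool_permitted` are hypothetical helpers, not runtime APIs.

```python
def parse_allowed_tools(value):
    """Parse 'Read, Grep, Glob' into a frozenset of tool names."""
    return frozenset(part.strip() for part in value.split(",") if part.strip())


def is_tool_permitted(tool, allowed):
    """Check a tool against the declared set.

    `allowed=None` models an absent field: no restriction was declared.
    """
    return allowed is None or tool in allowed


allowed = parse_allowed_tools("Read, Grep, Glob")
print(sorted(allowed))
print(is_tool_permitted("Read", allowed))
print(is_tool_permitted("Write", allowed))
```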

Input shape: there is no schema field

A common question: "Where do I declare what arguments my skill takes?"

There is no formal inputs schema in the YAML contract. Skills receive a free-form args string when invoked via the Skill tool, and a free-form user message when invoked via slash command. The skill body is responsible for parsing what it got.

The convention that works: declare the input shape inside the body, in a section the agent reads early.

## Required Inputs

The caller passes these as part of the user context:

- **NICHE**: one of the registered subdomain slugs
- **SLUG**: kebab-case identifier
- **KEYWORD**: search intent target

If any are missing, fail with a request-for-info.

The body's ## Required Inputs section IS the schema. The agent reads it, parses the user context against it, and asks for missing fields. This is more flexible than a rigid JSON Schema because the agent can negotiate ("you didn't pass KEYWORD, but I see SLUG; should I derive one?") in a way a strict schema can't.
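The agent does this parsing in natural language, but a pipeline that calls skills programmatically may want the same list up front. The sketch below assumes the bullet convention shown above (a bold uppercase name at the start of each bullet) and ignores whatever separator follows the name. `required_inputs` and `missing_inputs` are hypothetical helpers.

```python
import re

# A bullet that opens with a bold, uppercase input name: - **NAME** ...
BULLET_RE = re.compile(r"^-\s+\*\*([A-Z][A-Z0-9_]*)\*\*")


def required_inputs(body):
    """Collect input names from bullets under a '## Required Inputs' heading."""
    names, in_section = [], False
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Entering or leaving the section on every level-2 heading.
            in_section = stripped == "## Required Inputs"
            continue
        match = BULLET_RE.match(stripped)
        if in_section and match:
            names.append(match.group(1))
    return names


def missing_inputs(body, provided):
    """Return declared inputs that the caller did not supply."""
    return [name for name in required_inputs(body) if name not in provided]


sample_body = """# Example Skill

## Required Inputs

- **NICHE**: one of the registered subdomain slugs
- **SLUG**: kebab-case identifier
- **KEYWORD**: search intent target

## Procedure

- **NOT_AN_INPUT**: bullets outside the section are ignored
"""

print(required_inputs(sample_body))
print(missing_inputs(sample_body, {"NICHE": "seo", "SLUG": "my-post"}))
```

A caller can use the missing list to fail fast, while the agent remains free to negotiate when it is invoked conversationally.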

A working example

Here is a minimal skill that demonstrates the core concepts:

---
name: csv-summarizer
description: When the user wants a quick numeric summary of a CSV file. Also trigger on "summarize this CSV", "what's in this data", "describe this dataset". Use this for one-shot summaries; for full data analysis pipelines, see data-pipeline.
model: haiku
version: 2
---

# CSV Summarizer

You summarize CSV files in 5 lines or fewer.

## Required Inputs

- **PATH**: absolute path to the CSV file

If PATH is missing, ask for it before reading anything.

## Procedure

1. Read the CSV at PATH
2. Output: row count, column count, column names, dtype guesses, one-line semantic guess
3. Stop. Do not produce additional analysis unless asked.

name is the slash command. description lists three trigger phrases plus a redirect to a sibling skill. model: haiku because summarization is a fast-classification task and Opus would be overkill; Haiku is roughly 3× faster for this workload. version: 2 because v1 didn't include the dtype guess and downstream consumers may want to know the contract changed. The body declares its input shape in prose.

Versioning trade-offs: integer vs semver

You'll see both in the wild. The trade-off:

Integer (version: 2): simple, monotonic, no ambiguity about ordering. Right when the skill is internal and you control all callers.
Semver (version: 2.1.0): communicates patch vs minor vs major changes, useful when the skill is published in a marketplace and consumers need to pin to a major version. Right when the skill is published externally.

For skills inside a single team or repo, integer wins on simplicity. For skills published to a public registry, semver wins on consumer ergonomics. Don't mix the two within one skill's history; pick one and stick with it.
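Consumers that pin or compare versions need to handle both schemes. This is a minimal sketch, assuming versions are either a bare integer or a three-part MAJOR.MINOR.PATCH string with no pre-release tags; `parse_version`, `same_scheme`, and `is_breaking` are hypothetical helpers.

```python
def parse_version(value):
    """Turn 2, "2", or "2.1.0" into a comparable tuple of ints."""
    parts = str(value).split(".")
    if len(parts) not in (1, 3) or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected an integer or MAJOR.MINOR.PATCH, got {value!r}")
    return tuple(int(p) for p in parts)


def same_scheme(history):
    """True if every version in a skill's history uses one scheme."""
    return len({len(parse_version(v)) for v in history}) <= 1


def is_breaking(old, new):
    """True if the leading (major) component changed between versions."""
    return parse_version(new)[0] != parse_version(old)[0]


print(parse_version(2), parse_version("2.1.0"))
print(same_scheme([1, 2, 3]), same_scheme([1, "2.0.0"]))
print(is_breaking("2.1.0", "3.0.0"), is_breaking("2.1.0", "2.2.0"))
```

With integer versions every bump is breaking by definition, so `is_breaking` only adds information for semver histories.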

What breaks when the contract is wrong

A few failure modes to watch for:

Missing description: skill loads but never fires, because there's nothing for discovery to match against
Description that summarizes instead of triggers: fires inconsistently, often only when the user types the skill name explicitly
name collision with a built-in command: silently shadowed; built-in usually wins
YAML parse error in frontmatter: entire skill file fails to load with no graceful warning; check --debug output to spot it
Version bump without updating callers: downstream pipelines parse old shape and silently produce wrong output

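The first three are discovery problems. The last two are not: one surfaces at load time, before discovery can happen, and the other surfaces after invocation, in downstream consumers. Fix them with different mental models.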

Closing

The skill YAML contract is small but load-bearing. Two required fields, three or four optional ones, and a body that doubles as the input schema. Understanding discovery vs invocation is what separates a skill that fires reliably from one that the agent ignores.

When a new skill misbehaves, the first question to ask is: did the orchestrator pick it? If yes, debug the body. If no, debug the description. That triage saves hours.
