AI Skills in Practice: Anatomy of a Skill

A skill is not a long prompt. It is a structured workflow with a trigger, a prompt body, references, and an output contract. This post breaks down each component, shows how they interact, and explains the design decisions that separate skills that work from skills that frustrate.

Tin Dang
Figure: Exploded diagram showing the components of an AI skill (trigger, prompt, references, output)

In Part 1 we defined what a skill is. In Part 2 we built the context layer that every skill depends on. Now we open the hood and look at how a skill is actually constructed.

A skill has four components. Miss any one of them and the skill degrades from a reliable workflow to a fancy prompt that sometimes works.

The Four Components

Every effective skill, regardless of which AI tool you use, has these pieces:

┌─────────────────────────────────────────┐
│                  SKILL                  │
│                                         │
│  ┌─────────┐  When does this activate?  │
│  │ Trigger │                            │
│  └─────────┘                            │
│                                         │
│  ┌─────────┐  What does the AI do?      │
│  │ Prompt  │                            │
│  └─────────┘                            │
│                                         │
│  ┌────────────┐  What knowledge does    │
│  │ References │  it need?               │
│  └────────────┘                         │
│                                         │
│  ┌────────┐  What should the output     │
│  │ Output │  look like?                 │
│  └────────┘                             │
└─────────────────────────────────────────┘

Let us examine each one.

Component 1: The Trigger

The trigger defines when a skill activates. It answers: “Under what circumstances should this workflow run instead of a general-purpose response?”

Triggers come in three forms:

Explicit triggers

The user directly invokes the skill by name or command.

/review → activates the code review skill
/blog-writer → activates the blog writing skill
/debug → activates the debugging skill

Explicit triggers are unambiguous. The user knows what they are asking for. The AI knows what to do. Use explicit triggers for skills that represent distinct, named workflows.
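As a rough illustration, explicit triggers behave like a lookup table from command to skill. The sketch below is Python; the registry, the skill names on the right-hand side, and the `resolve_explicit_trigger` helper are all hypothetical and do not correspond to any particular tool's API.

```python
from typing import Optional

# Hypothetical registry. The commands come from the examples above;
# the skill names and the lookup mechanism are assumptions.
EXPLICIT_TRIGGERS = {
    "/review": "code-review",
    "/blog-writer": "blog-writer",
    "/debug": "debugging",
}


def resolve_explicit_trigger(message: str) -> Optional[str]:
    """Return the skill name if the message starts with a known command."""
    parts = message.strip().split(maxsplit=1)
    if not parts:
        return None
    # Only the first token is treated as the command; the rest is its argument.
    return EXPLICIT_TRIGGERS.get(parts[0])


print(resolve_explicit_trigger("/review src/auth.py"))
print(resolve_explicit_trigger("please review my code"))
```

The second call returns `None` because nothing was invoked by name. That case falls through to pattern or contextual triggers, or to a general-purpose response.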

Pattern triggers

The skill activates when the user’s request matches a described pattern.

trigger: "When the user asks to review code, audit a pull request,
or check code quality"

Pattern triggers are more flexible but less predictable. The AI must interpret whether the user’s request matches the pattern. Write patterns that are specific enough to avoid false positives but broad enough to catch natural variations.

Good pattern: “When the user asks to review code changes, audit a PR, or check for security issues in a diff”

Bad pattern: “When the user talks about code” — this matches almost everything and will hijack normal conversations.

Contextual triggers

The skill activates based on what the AI observes in the environment, not what the user says.

trigger: "When working with files in src/content/blog/ and the user
creates or modifies an MDX file"

Contextual triggers are powerful but must be used carefully. A skill that activates every time you touch a certain directory can become annoying quickly. Reserve contextual triggers for guardrails (preventing known mistakes) rather than workflows (doing complex tasks).
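To make the condition concrete, here is a minimal Python sketch of the check that the trigger above describes in words. In a real tool the AI or the host application evaluates this; the `contextual_trigger_matches` function is a hypothetical stand-in, and only the `src/content/blog/` path and the `.mdx` extension come from the example.

```python
from pathlib import PurePosixPath

# Directory taken from the trigger example above.
BLOG_DIR = PurePosixPath("src/content/blog")


def contextual_trigger_matches(changed_path: str) -> bool:
    """True when the changed file is an MDX file somewhere under BLOG_DIR."""
    path = PurePosixPath(changed_path)
    return path.suffix == ".mdx" and BLOG_DIR in path.parents


print(contextual_trigger_matches("src/content/blog/anatomy-of-a-skill.mdx"))
print(contextual_trigger_matches("src/components/Card.tsx"))
```

A condition this narrow is what keeps a contextual trigger tolerable: it fires on one file type in one directory and stays out of the way everywhere else.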

Trigger Design Principles

  1. One skill, one trigger scope. A skill that triggers on “code review, deployment, and documentation” is three skills pretending to be one. Split them.
  2. Triggers should be mutually exclusive. If two skills can activate on the same input, the AI must choose — and it will sometimes choose wrong. Design triggers so they do not overlap.
  3. Make the trigger visible. Users should be able to discover what skills exist and when they activate. A skill with a hidden contextual trigger that silently changes behavior is a debugging nightmare.
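One way to catch overlapping triggers before they cause trouble is to run a handful of sample requests against every trigger and flag the requests that match more than one skill. The Python sketch below uses plain substring matching as a crude stand-in for the AI's interpretation of a pattern, so treat it as a smoke test rather than a faithful model. The skill names, phrases, and sample requests are hypothetical.

```python
from typing import Dict, List


def matching_skills(request: str, triggers: Dict[str, List[str]]) -> List[str]:
    """Names of skills whose trigger phrases appear in the request."""
    text = request.lower()
    return sorted(
        name
        for name, phrases in triggers.items()
        if any(phrase in text for phrase in phrases)
    )


def find_overlaps(samples: List[str], triggers: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each sample request that activates more than one skill to those skills."""
    overlaps = {}
    for sample in samples:
        hits = matching_skills(sample, triggers)
        if len(hits) > 1:
            overlaps[sample] = hits
    return overlaps


# Hypothetical triggers: one specific, one as broad as "talks about code".
TRIGGERS = {
    "code-review": ["review code", "audit a pr", "code quality"],
    "code-chat": ["code"],
}
SAMPLES = [
    "Can you review code in this diff?",
    "How do I format code in Markdown?",
]
print(find_overlaps(SAMPLES, TRIGGERS))
```

The broad trigger matches both samples and collides with the review skill on the first one, which is the overlap that principle 2 warns about.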

Component 2: The Prompt

The prompt is the core of the skill — the instructions that tell the AI what to do, step by step. This is where most people go wrong, because they write prompts the way they write chat messages: vague, narrative, and implicit.

A skill prompt is not a conversation. It is a specification.

Structure of an Effective Prompt

## Role
You are a code reviewer focused on production readiness.

## Workflow
1. Read all changed files in the diff
2. For each file, evaluate against these dimensions:
   - Correctness: Does the logic match the stated intent?
   - Security: Any injection, auth bypass, or data exposure risks?
   - Performance: Unnecessary allocations, N+1 queries, missing indexes?
   - Readability: Clear naming, reasonable function length?
   - Test coverage: Are edge cases tested?
3. Classify each finding by severity: CRITICAL, WARNING, NOTE
4. Determine overall verdict: APPROVE, REQUEST_CHANGES, or BLOCK

## Rules
- BLOCK if any CRITICAL finding exists
- REQUEST_CHANGES if more than 2 WARNINGs
- APPROVE otherwise
- Never suggest style changes that are not in the coding standards
- Do not comment on code you did not review

Notice what this prompt does that a chat message would not:

  • Defines a role — scopes the AI’s perspective
  • Specifies a sequence — steps happen in order, not all at once
  • Lists dimensions — ensures nothing gets forgotten
  • Sets decision criteria — removes ambiguity from the verdict
  • Includes negative constraints — prevents common failure modes
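The decision criteria are precise enough to write down as code, which is a useful test of whether a rule is really unambiguous. The Python sketch below encodes the three verdict rules from the example prompt; the `decide_verdict` function is illustrative only, since in the skill itself the AI applies these rules.

```python
from collections import Counter
from typing import Iterable


def decide_verdict(severities: Iterable[str]) -> str:
    """Apply the verdict rules from the example prompt, in priority order."""
    counts = Counter(severities)
    if counts["CRITICAL"] > 0:
        return "BLOCK"            # any CRITICAL finding blocks
    if counts["WARNING"] > 2:
        return "REQUEST_CHANGES"  # more than 2 WARNINGs
    return "APPROVE"              # everything else


print(decide_verdict(["WARNING", "WARNING", "NOTE"]))
print(decide_verdict(["WARNING", "WARNING", "WARNING"]))
print(decide_verdict(["NOTE", "CRITICAL"]))
```

Writing the rules this way also surfaces a boundary case: exactly two WARNINGs still yields APPROVE, because the rule says "more than 2". If that is not what you intend, fix the wording in the prompt.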

The Prompt Ladder

Think of prompt quality as a ladder with four rungs:

Rung 1 — Vague request: “Review this code.” The AI does whatever it thinks “review” means. Different every time.

Rung 2 — Specific request: “Review this code for security vulnerabilities and performance issues.” Better focus, but no structure, no severity levels, no output format.

Rung 3 — Structured request: “Review this code. Check for security, performance, readability, and test coverage. Rate each finding as critical, warning, or note. Give an overall verdict.” Consistent dimensions and output, but still a one-off prompt.

Rung 4 — Skill: The structured request plus trigger, references to coding standards, and a defined output contract. Consistent, reusable, improvable, shareable.

Many ad hoc prompts sit at Rung 1 or 2. Skills live at Rung 4. The gap between Rung 2 and Rung 4 is not AI capability; it is specification quality.

Component 3: References

References are external documents that the skill reads to inform its execution. They solve a fundamental problem: you cannot fit everything the AI needs to know into the prompt itself.

Types of References

Standards documents — coding guidelines, style guides, API conventions:

references:
- coding-standards.md # Variable naming, function patterns
- security-checklist.md # OWASP top 10 checks for this stack
- api-conventions.md # Endpoint naming, error response format

Format references — templates, examples, structures:

references:
- blog-formats.md # 8 post structures with examples
- tone-and-voice.md # Voice profiles for different audiences
- seo-checklist.md # SEO requirements and meta description rules

Domain knowledge — project-specific information the AI cannot infer:

references:
- architecture-decisions.md # ADRs explaining why things are built this way
- glossary.md # Domain terms with precise definitions
- known-issues.md # Bugs and workarounds the AI should know about

Reference Design Principles

  1. References should be standalone documents. A reference that only makes sense in the context of the skill prompt is not a reference — it is part of the prompt. Move it there.

  2. References should be reusable. Your coding standards document should work for the code review skill, the code generation skill, and the refactoring skill. If you are duplicating content across skills, extract it into a shared reference.

  3. Keep references focused. A 50-page coding standards document is not helpful. The AI will struggle to find the relevant section. Break large documents into focused files: naming conventions, error handling patterns, testing requirements.

  4. References decay. A reference that was accurate six months ago may describe patterns your team has since abandoned. Review references on the same cadence as your context files.
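Principle 4 is easy to automate if you record when each reference was last reviewed. The Python sketch below assumes you keep that date somewhere, such as front matter or a small manifest. The `Reference` record, the dates, and the 180-day threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class Reference:
    path: str
    last_reviewed: date  # hypothetical metadata; track it however suits your repo


def stale_references(refs: List[Reference], today: date, max_age_days: int = 180) -> List[str]:
    """Paths of references last reviewed more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return [ref.path for ref in refs if ref.last_reviewed < cutoff]


# Illustrative dates only.
refs = [
    Reference("coding-standards.md", date(2024, 11, 15)),
    Reference("security-checklist.md", date(2025, 5, 1)),
]
print(stale_references(refs, today=date(2025, 7, 1)))
```

Passing `today` explicitly keeps the check deterministic, so it can run in CI and produce the same result on every machine.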

Component 4: The Output Contract

The output contract defines what the skill produces. Without it, the same skill can generate a three-sentence summary on one run and a five-page report on the next, depending on how the model happens to interpret the request.

Defining Output Structure

## Output Format
### Summary
One paragraph: what was reviewed, overall quality assessment, key risk areas.
### Findings
For each finding:
- **File**: path/to/file.ts
- **Line**: 42
- **Severity**: CRITICAL | WARNING | NOTE
- **Category**: Security | Performance | Readability | Correctness | Testing
- **Finding**: One sentence describing the issue
- **Suggestion**: One sentence describing the fix
### Verdict
APPROVE | REQUEST_CHANGES | BLOCK
### Metrics
- Files reviewed: N
- Findings: N critical, N warnings, N notes

A defined output contract gives you three things:

  1. Parseable results. If every code review produces findings in the same format, you can build automation on top — tracking findings over time, generating reports, feeding back into CI.

  2. Comparable results. When two different people run the same skill on the same code, the output structure matches. You can compare results, identify gaps, and calibrate the skill.

  3. Quality measurement. You can evaluate whether a skill is working by checking its output against the contract. Missing sections? Inconsistent severity ratings? Wrong format? These are signals that the skill prompt needs refinement.
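Because the contract is explicit, checking an output against it can be mechanical. The Python sketch below validates a review against the example format above: it looks for the four `###` sections, the allowed severity labels, and a known verdict. The `check_review_output` function and the sample text are hypothetical, and a production checker would likely validate more fields.

```python
import re
from typing import List

REQUIRED_SECTIONS = ("Summary", "Findings", "Verdict", "Metrics")
VALID_SEVERITIES = {"CRITICAL", "WARNING", "NOTE"}
VALID_VERDICTS = {"APPROVE", "REQUEST_CHANGES", "BLOCK"}


def check_review_output(markdown: str) -> List[str]:
    """Return a list of contract violations; an empty list means it conforms."""
    problems = []

    # Every H3 section named in the contract must be present.
    headings = [h.strip() for h in re.findall(r"^### (.+)$", markdown, re.MULTILINE)]
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")

    # Each finding's severity must be one of the allowed labels.
    for raw in re.findall(r"^- \*\*Severity\*\*: (.+)$", markdown, re.MULTILINE):
        if raw.strip() not in VALID_SEVERITIES:
            problems.append(f"invalid severity: {raw.strip()}")

    # The first token after the Verdict heading must be a known verdict.
    verdict = re.search(r"^### Verdict\s*\n+(\S+)", markdown, re.MULTILINE)
    if verdict and verdict.group(1) not in VALID_VERDICTS:
        problems.append(f"invalid verdict: {verdict.group(1)}")

    return problems


# A deliberately broken sample: no Metrics section, unknown severity and verdict.
sample = """### Summary
Reviewed one file.
### Findings
- **File**: path/to/file.ts
- **Severity**: HIGH
### Verdict
MAYBE
"""
print(check_review_output(sample))
```

Each reported problem points back at the prompt. A recurring invalid severity, for example, suggests the prompt should list the allowed values more prominently.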

Output Contract Patterns

Structured verdict: Use when the skill makes a decision (approve/reject, pass/fail, safe/unsafe).

Checklist report: Use when the skill audits against criteria (accessibility audit, security review, compliance check).

Transformation: Use when the skill converts input to output (blog outline to draft, requirements to test cases, schema to migration).

Recommendation list: Use when the skill suggests actions (optimization opportunities, refactoring candidates, dependency updates).

Putting It All Together

Here is a complete skill for writing blog posts, combining all four components:

name: blog-writer
description: "Writes publication-ready blog posts with strong hooks,
  clear structure, and audience-appropriate voice"
trigger: "When the user asks to write, draft, or create a blog post"
prompt: |
  ## Workflow
  1. Clarify scope — gather topic, audience, tone, format, word count
  2. Select format — pick the best structure from blog-formats reference
  3. Draft headlines — write 3-5 options, recommend the strongest
  4. Write the post — follow the selected format's structure
  5. Polish — apply readability and SEO checks
  ## Rules
  - Open with a hook that earns the next paragraph
  - Every section must justify its existence
  - Use concrete examples, numbers, and specifics
  - Match voice to audience (default: Professional Conversational)
  - End with a clear, specific call to action
references:
  - blog-formats.md       # 8 post structures with examples
  - seo-and-structure.md  # Headlines, hooks, SEO, readability
  - tone-and-voice.md     # Voice profiles and audience adaptation
output: |
  Markdown with:
  - H1 for the title
  - H2 for major sections
  - H3 for subsections (sparingly)
  - Fenced code blocks with language tags
  - Bold for key phrases (1-2 per section max)
  - Meta description as a comment if SEO was requested

Notice how each component handles a distinct concern. The trigger says when. The prompt says how. The references provide knowledge. The output contract says what it looks like. Change any one component and the others still hold.
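If you keep skills as structured data, a small check can confirm that none of the four components is missing. The Python sketch below models a skill as a dataclass. The field names mirror the keys in the definition above, but the `Skill` class and its `missing_components` method are hypothetical and not part of any tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skill:
    name: str
    trigger: str = ""
    prompt: str = ""
    references: List[str] = field(default_factory=list)
    output: str = ""

    def missing_components(self) -> List[str]:
        """Names of the four components that are empty or absent."""
        components = {
            "trigger": self.trigger.strip(),
            "prompt": self.prompt.strip(),
            "references": self.references,
            "output": self.output.strip(),
        }
        return [name for name, value in components.items() if not value]


# A draft with the output contract left out.
draft = Skill(
    name="blog-writer",
    trigger="When the user asks to write, draft, or create a blog post",
    prompt="## Workflow\n1. Clarify scope\n2. Select format",
    references=["blog-formats.md", "tone-and-voice.md"],
)
print(draft.missing_components())
```

The draft reports `output` as missing, the gap that lets the same request produce differently shaped results from run to run.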

Common Skill Anti-Patterns

The kitchen sink skill

A skill that tries to do everything: review code, write tests, update docs, and deploy. This skill will do all four things poorly. Split it into four focused skills.

The rigid skill

A skill with so many specific rules that it cannot adapt to edge cases. “Always use exactly 3 H2 sections” breaks when the content naturally needs 5. Use constraints to protect quality (every section must earn its place), not to dictate an arbitrary shape (exactly three sections).

The stateless skill

A skill that ignores what happened before it ran. A deploy skill that does not check git status first. A review skill that does not read the PR description. Skills should gather context from the environment, not just from the user’s message.
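As a sketch of what gathering context can look like for the deploy example, the Python snippet below checks for uncommitted changes before anything else runs. `git status --porcelain` is a real Git command. The `preflight` function, the injectable `run` callable, and the canned output in the demo are assumptions made so the example runs without a repository.

```python
import subprocess
from typing import Callable, List


def run_git(args: List[str]) -> str:
    """Run a git subcommand in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def preflight(run: Callable[[List[str]], str] = run_git) -> List[str]:
    """Return reasons not to deploy; an empty list means the tree is clean."""
    blockers = []
    # --porcelain prints one line per changed or untracked file.
    status = run(["status", "--porcelain"])
    dirty = [line for line in status.splitlines() if line.strip()]
    if dirty:
        blockers.append(f"{len(dirty)} uncommitted change(s)")
    return blockers


# Demo with canned output instead of a real repository.
def fake_run(args: List[str]) -> str:
    return " M src/app.py\n?? notes.txt\n"


print(preflight(fake_run))
```

The same pattern extends to the review example: read the PR description first and pass it along with the diff, so the skill is not working from the user's message alone.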

The undocumented skill

A skill with no description, no clear trigger, and an opaque prompt. If someone on your team cannot understand what the skill does by reading it for thirty seconds, it needs better documentation.

From Anatomy to Practice

You now understand the four components of a skill and the design principles that make each one effective. In Part 4, we put this knowledge to work: identifying a repetitive workflow in your daily work and extracting it into your first custom skill, step by step.

The best way to internalize these concepts is to look at a skill you already use — or a prompt you keep retyping — and map it to the four components. Which pieces are there? Which are missing? The missing pieces explain why the results are inconsistent.

Next in this series

AI Skills in Practice: Building Your First Skill
