Tag

ai

23 posts

Ai Ml

Skills — Giving an Agent a Playbook

Bonus post. Skills are not a new rung on the ladder — they're a way of packaging the rungs you already have, so an agent can pull a mini-playbook off the shelf on demand instead of carrying every instruction in its head. Here's the concept, the mechanism, and when to reach for one.

11 min read
Ai Ml

Multi-Agent Systems & Production Platforms

One agent is a worker. A team of agents with a supervisor, evals, tracing, guardrails, and cost control is a platform. Here's when multi-agent actually helps, when it hurts, and the four pieces of scaffolding that turn a demo into a product you can run.

12 min read
Ai Ml

Memory — How Agents Build Continuity

Context windows forget. Production agents don't. The difference is a layered architecture: working memory, session memory, and long-term memory split into facts, events, and skills. Here's how real agents remember — and why forgetting on purpose matters.

10 min read
Ai Ml

The Agent Loop — ReAct, Plan-Act-Observe

One tool call is an API. A loop of tool calls with reasoning in between is an agent. This post walks through the four-step cycle that turns one-shot chat into step-by-step work — and the surprisingly tricky question of when to stop.

10 min read
Ai Ml

Tool Use — Giving the Model Hands

Text in, text out — until you let the model call functions. This is the moment a chatbot stops explaining how to do things and starts actually doing them. Here's what function calling really is, the typed contract that makes it work, and why this is the hinge rung of the whole ladder.

11 min read
Ai Ml

RAG — Giving the Model a Library

Your company's docs, this morning's tickets, yesterday's deploy notes — none of it is in the model. Retrieval-augmented generation hands it the right page, right before it answers. Here's the two-phase pipeline, honest failure modes, and what 'embeddings' actually are.

11 min read
Ai Ml

System Prompts & Personas: The Cheapest Control Surface

Before RAG, before tools, before agents — a well-written system prompt is still the single highest-leverage knob you have. Here's what it actually does, why most people misuse it, and what its honest limits are.

9 min read
Ai Ml

The Chat Baseline: What You're Starting With

Every AI product in the world starts as the same small, strange thing — a stateless function that turns tokens into tokens. Understand that substrate clearly and every capability above it stops looking like magic.

9 min read
Architecture

The Tool System: How an AI Gets Hands

A language model without tools is an expensive autocomplete. This post dissects how a production AI harness defines, registers, validates, and executes 40+ tools — from file reads to shell commands to MCP integrations — with type safety, concurrency control, and deferred loading.

9 min read
Architecture

Anatomy of an AI Harness: What Lives Between You and the Model

Everyone debates which AI model is best. The real engineering happens in the harness — the production system of tools, permissions, memory, and orchestration that makes any model actually useful. This is a map of that system, drawn from real source code.

9 min read
Architecture

The Permission Boundary: Human-in-the-Loop at Scale

An AI with shell access and no guardrails will eventually destroy something you care about. This post dissects how a production harness implements layered permissions, hooks, dangerous pattern detection, and trust boundaries — balancing safety with usability.

10 min read
Architecture

The Orchestration Loop: Where Everything Converges

The orchestration loop is the heart of an AI harness — a state machine that coordinates API calls, streaming responses, concurrent tool execution, error recovery, and token budgets. This post traces the complete data flow from user input to final response.

9 min read
Architecture

Skills: Packaging AI Workflows as Code

Ad-hoc prompting is fine for one-off questions. Repeatable workflows deserve structure. This post dissects how a production harness defines, discovers, loads, and executes skills — reusable AI workflows that turn tribal knowledge into executable automation.

8 min read
Architecture

State, Cost, and the Production Surface

The invisible foundation beneath every AI harness layer: centralized state management, per-model cost tracking, rate limit handling, a custom React-to-terminal renderer, and multiple entry points. This post covers what turns 'works in a demo' into 'works in production.'

9 min read
Architecture

Tasks and Concurrency: Background Agents at Work

A production AI harness is not single-threaded. Background agents explore codebases, shell commands execute, remote agents run on cloud infrastructure — all while the main conversation continues. This post dissects the task system that manages this concurrency.

8 min read
Architecture

Context Engineering: Building the Model's World

The model is only as good as the context it receives. This post dissects how a production AI harness constructs system prompts, loads project instructions, manages persistent memory, and compresses context when the window fills — all to give the model the right information at the right time.

9 min read
Ai Ml

From Chat to Agent: Why the Leap Matters

A chatbot answers. An agent finishes the job. The gap between them is not one feature — it's a stack of seven capabilities, built one rung at a time. This series walks that stack in plain English, for engineers and curious non-engineers alike.

9 min read
Tutorial

AI Skills in Practice: What Are AI Skills (And Why Prompting Isn't Enough)

You have been typing instructions into AI assistants one conversation at a time. Skills flip that model — they turn your best prompts into reusable, structured workflows that any team member can run. This post explains the shift from ad-hoc prompting to skill-driven development.

7 min read
Tutorial

AI Skills in Practice: Context Is the Skill

Every AI assistant starts each conversation knowing nothing about your project. Context files change that — they encode your stack, conventions, and constraints so the AI works with your codebase instead of against it. This post shows how to build the foundation layer that makes every skill smarter.

9 min read
Tutorial

AI Skills in Practice: Building Your First Skill

You have used AI skills. Now you build one. This post walks through the complete process: identifying a repetitive workflow, extracting it into a structured skill, testing it against real work, and iterating until it reliably produces quality results.

9 min read
Tutorial

AI Skills in Practice: Anatomy of a Skill

A skill is not a long prompt. It is a structured workflow with a trigger, a prompt body, references, and an output contract. This post breaks down each component, shows how they interact, and explains the design decisions that separate skills that work from skills that frustrate.

11 min read
Tutorial

AI Skills in Practice: Composing Skills — Agents, Hooks, and Pipelines

A single skill handles a single workflow. But real development involves chains of workflows — review then fix then test then commit. This post covers composition patterns: how skills delegate to other skills, how hooks automate triggers, and how pipelines chain skills into end-to-end workflows.

9 min read
Tutorial

AI Skills in Practice: Skill Patterns for Real Workflows

Four complete skill definitions for workflows developers run every day: debugging, code review, technical writing, and deployment. Each pattern is tool-agnostic, tested in production, and ready to adapt to your own projects.

13 min read