Blog

62 posts on engineering, architecture, and technical craft.

AI/ML

Skills — Giving an Agent a Playbook

Bonus post. Skills are not a new rung on the ladder — they're a way of packaging the rungs you already have, so an agent can pull a mini-playbook off the shelf on demand instead of carrying every instruction in its head. Here's the concept, the mechanism, and when to reach for one.

11 min read
AI/ML

Multi-Agent Systems & Production Platforms

One agent is a worker. A team of agents with a supervisor, evals, tracing, guardrails, and cost control is a platform. Here's when multi-agent actually helps, when it hurts, and the four pieces of scaffolding that turn a demo into a product you can run.

12 min read
AI/ML

Memory — How Agents Build Continuity

Context windows forget. Production agents don't. The difference is a layered architecture: working memory, session memory, and long-term memory split into facts, events, and skills. Here's how real agents remember — and why forgetting on purpose matters.

10 min read
AI/ML

The Agent Loop — ReAct, Plan-Act-Observe

One tool call is an API. A loop of tool calls with reasoning in between is an agent. This post walks through the four-step cycle that turns one-shot chat into step-by-step work — and the surprisingly tricky question of when to stop.

10 min read
AI/ML

Tool Use — Giving the Model Hands

Text in, text out — until you let the model call functions. This is the moment a chatbot stops explaining how to do things and starts actually doing them. Here's what function calling really is, the typed contract that makes it work, and why this is the hinge rung of the whole ladder.

11 min read
AI/ML

RAG — Giving the Model a Library

Your company's docs, this morning's tickets, yesterday's deploy notes — none of it is in the model. Retrieval-augmented generation hands it the right page, right before it answers. Here's the two-phase pipeline, honest failure modes, and what 'embeddings' actually are.

11 min read
AI/ML

System Prompts & Personas: The Cheapest Control Surface

Before RAG, before tools, before agents — a well-written system prompt is still the single highest-leverage knob you have. Here's what it actually does, why most people misuse it, and what its honest limits are.

9 min read
AI/ML

The Chat Baseline: What You're Starting With

Every AI product in the world starts as the same small, strange thing — a stateless function that turns tokens into tokens. Understand that substrate clearly and every capability above it stops looking like magic.

9 min read
Architecture

The Tool System: How an AI Gets Hands

A language model without tools is an expensive autocomplete. This post dissects how a production AI harness defines, registers, validates, and executes 40+ tools — from file reads to shell commands to MCP integrations — with type safety, concurrency control, and deferred loading.

9 min read
Architecture

Anatomy of an AI Harness: What Lives Between You and the Model

Everyone debates which AI model is best. The real engineering happens in the harness — the production system of tools, permissions, memory, and orchestration that makes any model actually useful. This is a map of that system, drawn from real source code.

9 min read