Tag

systems-design

6 posts

Architecture

The Tool System: How an AI Gets Hands

A language model without tools is an expensive autocomplete. This post dissects how a production AI harness defines, registers, validates, and executes 40+ tools — from file reads to shell commands to MCP integrations — with type safety, concurrency control, and deferred loading.

9 min read

Architecture

Anatomy of an AI Harness: What Lives Between You and the Model

Everyone debates which AI model is best. The real engineering happens in the harness — the production system of tools, permissions, memory, and orchestration that makes any model actually useful. This is a map of that system, drawn from real source code.

9 min read

Architecture

The Permission Boundary: Human-in-the-Loop at Scale

An AI with shell access and no guardrails will eventually destroy something you care about. This post dissects how a production harness implements layered permissions, hooks, dangerous-pattern detection, and trust boundaries, balancing safety with usability.

10 min read

Architecture

The Orchestration Loop: Where Everything Converges

The orchestration loop is the heart of an AI harness — a state machine that coordinates API calls, streaming responses, concurrent tool execution, error recovery, and token budgets. This post traces the complete data flow from user input to final response.

9 min read

Architecture

State, Cost, and the Production Surface

The invisible foundation beneath every AI harness layer: centralized state management, per-model cost tracking, rate-limit handling, a custom React-to-terminal renderer, and multiple entry points. This post covers what turns 'works in a demo' into 'works in production.'

9 min read

Architecture

Tasks and Concurrency: Background Agents at Work

A production AI harness is not single-threaded. Background agents explore codebases, shell commands execute, remote agents run on cloud infrastructure — all while the main conversation continues. This post dissects the task system that manages this concurrency.

8 min read