
Tasks and Concurrency: Background Agents at Work

A production AI harness is not single-threaded. Background agents explore codebases, shell commands execute, remote agents run on cloud infrastructure — all while the main conversation continues. This post dissects the task system that manages this concurrency.

Tin Dang
[Figure: hand-drawn architecture diagram of an AI harness with the Tasks and Concurrency layer highlighted]

When you ask an AI assistant to “research the authentication patterns in this codebase and then implement the new login flow,” you are describing two operations: a research phase and an implementation phase. In a single-threaded system, these run sequentially — the model reads for 30 seconds, then writes for 60 seconds, while you wait.

In a production harness, the model can delegate research to a background agent while it starts planning the implementation. Multiple operations run concurrently. The main conversation continues while background work executes. Results flow back when ready.

This is the task system.

Seven Task Types

The system manages seven distinct types of concurrent work:

Type                  Description                      ID Prefix
local_bash            Shell command execution          b
local_agent           Background agent (sub-query)     a
remote_agent          Cloud-hosted agent               r
in_process_teammate   In-process team member           t
local_workflow        Workflow script execution        w
monitor_mcp           MCP resource monitor             m
dream                 Automatic memory consolidation   d

Each type generates task IDs with a type prefix followed by 8 random base-36 characters (36^8 ≈ 2.8 trillion combinations). The prefix makes IDs visually distinguishable: a7k9d2x1 is an agent, b3f8v6w2 is a bash command.
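A minimal sketch of the ID scheme, assuming nothing beyond the rule stated above (one prefix character plus 8 random base-36 characters). The names generateTaskId and taskTypeFromId are hypothetical, and the random source is injectable only so the function can be tested. The sample IDs in the text show seven characters after the prefix; this sketch follows the stated rule of eight.

```typescript
// Prefix table copied from the task-type table above.
const TASK_ID_PREFIX = {
  local_bash: "b",
  local_agent: "a",
  remote_agent: "r",
  in_process_teammate: "t",
  local_workflow: "w",
  monitor_mcp: "m",
  dream: "d",
} as const;

type TaskType = keyof typeof TASK_ID_PREFIX;

const BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz";

// Hypothetical helper. Math.random is adequate for display IDs, not for secrets.
function generateTaskId(type: TaskType, random: () => number = Math.random): string {
  let suffix = "";
  for (let i = 0; i < 8; i++) {
    suffix += BASE36.charAt(Math.floor(random() * BASE36.length));
  }
  return TASK_ID_PREFIX[type] + suffix;
}

// Reverse lookup: classify an ID by its first character.
function taskTypeFromId(id: string): TaskType | undefined {
  const entry = Object.entries(TASK_ID_PREFIX).find(([, prefix]) => prefix === id.charAt(0));
  return entry ? (entry[0] as TaskType) : undefined;
}
```

Because the prefix is always the first character, classification is a single lookup and never needs the task registry.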

Task Lifecycle

Every task follows a state machine:

pending → running → completed
                  → failed
                  → killed

The isTerminalTaskStatus() function returns true for completed, failed, and killed — the three states from which a task never returns.
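The state machine can be sketched as a transition table. This is an illustration, not the harness's code: canTransition is a hypothetical helper, and the table assumes that failed and killed branch from running, which is one reading of the diagram.

```typescript
type TaskStatus = "pending" | "running" | "completed" | "failed" | "killed";

// Allowed transitions, read off the lifecycle diagram.
// Assumption: a pending task cannot jump straight to a terminal state.
const TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  pending: ["running"],
  running: ["completed", "failed", "killed"],
  completed: [],
  failed: [],
  killed: [],
};

// A status is terminal when nothing leads out of it.
function isTerminalTaskStatus(status: TaskStatus): boolean {
  return TRANSITIONS[status].length === 0;
}

function canTransition(from: TaskStatus, to: TaskStatus): boolean {
  return TRANSITIONS[from].includes(to);
}
```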

Registration

function registerTask(task: TaskState, setAppState) {
  setAppState(prev => {
    // Check for replacement (resume case), reading from the latest state
    const existing = prev.tasks[task.id]
    // Merge with UI-held state to preserve display
    const merged = existing && existing.retain
      ? { ...task, ...preservedUIFields(existing) }
      : task
    // Store in centralized state
    return { ...prev, tasks: { ...prev.tasks, [task.id]: merged } }
  })
  // Notify SDK consumers
  emitSDKEvent('task_started', { taskId: task.id })
}

Output streaming

Each task writes output to a disk file at {sessionDir}/tasks/{taskId}. A read pointer (outputOffset) tracks how much the consumer has read, enabling incremental delta extraction:

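function getTaskOutputDelta(taskId, currentOffset) {
  // outputFile is the task's file at {sessionDir}/tasks/{taskId}
  const content = readFileFrom(outputFile, currentOffset)
  return {
    delta: content,
    // Assumes the offset and content.length are measured in the same unit
    newOffset: currentOffset + content.length
  }
}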

This means the main conversation can check on a background task and see only what has changed since the last check — not the entire output history.
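To make the read-pointer behavior concrete, here is a small sketch that replaces the disk file with an in-memory log. TaskOutputLog is a hypothetical stand-in chosen so the example runs anywhere; only the offset bookkeeping mirrors the description above.

```typescript
// In-memory stand-in for the per-task output file (an assumption for this
// sketch; the harness described above writes to disk).
class TaskOutputLog {
  private content = "";

  append(text: string): void {
    this.content += text;
  }

  // Returns only what was written after `offset`, plus the offset to pass next time.
  readFrom(offset: number): { delta: string; newOffset: number } {
    const delta = this.content.slice(offset);
    return { delta, newOffset: offset + delta.length };
  }
}

const outputLog = new TaskOutputLog();
let outputOffset = 0;

outputLog.append("Scanning src/auth/\n");
const first = outputLog.readFrom(outputOffset);
outputOffset = first.newOffset;

outputLog.append("Found jwt.ts\n");
const second = outputLog.readFrom(outputOffset); // only the second line
outputOffset = second.newOffset;

const third = outputLog.readFrom(outputOffset); // nothing new since the last check
```

With a real file, the offset would normally be a byte position. A string's length and its encoded byte count differ for non-ASCII output, so the two should not be mixed.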

Polling

The harness polls task state at 1-second intervals:

function pollTasks(getAppState, setAppState) {
  const tasks = getAppState().tasks
  for (const task of Object.values(tasks)) {
    if (isRunning(task)) {
      // Get output delta for status display
      const delta = getTaskOutputDelta(task.id, task.outputOffset)
      updateOffset(task.id, delta.newOffset)
    }
    if (isTerminal(task) && task.notified) {
      // Eligible for eviction
      if (pastGracePeriod(task)) {
        evictTask(task.id)
      }
    }
  }
}

The Notification Problem

The hardest challenge in the task system is not parallelism — it is notification delivery. When a background agent completes, the main conversation needs to know. But the main conversation might be mid-API-call, or waiting for user input, or processing its own tool results.

The solution: atomic notification delivery.

function enqueueAgentNotification(taskId, result) {
  let shouldEnqueue = false
  updateTaskState(taskId, setAppState, task => {
    if (task.notified) return task // Already sent: no-op
    shouldEnqueue = true
    return { ...task, notified: true }
  })
  if (shouldEnqueue) {
    enqueuePendingNotification({
      value: formatTaskNotification(taskId, result),
      mode: 'task-notification'
    })
  }
}

The notified flag is a check-and-set guard: only the first call flips it and enqueues the notification, and every later call is a no-op. This prevents duplicate notifications when multiple code paths detect the same task completion, for example when the polling loop and a direct completion callback race to report it.
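The guard can be exercised end to end with a toy store. In this sketch the tasks record and pendingNotifications array are stand-ins for the harness state, and updateTaskState is assumed to run its updater synchronously, which is what lets the check and the set happen without interleaving on JavaScript's single thread.

```typescript
type Task = { id: string; status: string; notified: boolean };

// Minimal stand-ins for the harness state and notification queue (assumptions).
const tasks: Record<string, Task> = {
  a7k9d2x1: { id: "a7k9d2x1", status: "completed", notified: false },
};
const pendingNotifications: string[] = [];

// Applies the updater synchronously against the current task.
function updateTaskState(taskId: string, updater: (task: Task) => Task): void {
  const current = tasks[taskId];
  if (current) tasks[taskId] = updater(current);
}

// Returns true only for the caller that actually enqueued the notification.
function enqueueAgentNotification(taskId: string, result: string): boolean {
  let shouldEnqueue = false;
  updateTaskState(taskId, task => {
    if (task.notified) return task; // a previous caller already claimed it
    shouldEnqueue = true;
    return { ...task, notified: true };
  });
  if (shouldEnqueue) pendingNotifications.push(`${taskId}: ${result}`);
  return shouldEnqueue;
}

// Two code paths detect the same completion.
const fromCallback = enqueueAgentNotification("a7k9d2x1", "done");
const fromPoller = enqueueAgentNotification("a7k9d2x1", "done");
```

Whichever path arrives first wins, and the second call leaves the queue untouched.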

Notification format

Task notifications use XML tags that the model is trained to parse:

<task_notification>
  <task_id>a7k9d2x1</task_id>
  <tool_use_id>toolu_abc123</tool_use_id>
  <status>completed</status>
  <summary>Analyzed 47 files in src/auth/</summary>
  <result>
    Found 3 authentication patterns:
    1. JWT-based (/api/protected/*)
    2. API key (/api/external/*)
    3. Session cookie (/admin/*)
  </result>
  <usage>
    <total_tokens>24891</total_tokens>
    <tool_uses>23</tool_uses>
    <duration_ms>18432</duration_ms>
  </usage>
</task_notification>

The model receives this as a system message and can reference the results in its next response.
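A formatter producing this shape might look like the sketch below. The tag names come from the example above; the TaskResult field names and the escapeXml step are assumptions, since the source does not say how the harness stores results or whether it escapes content.

```typescript
type TaskUsage = { totalTokens: number; toolUses: number; durationMs: number };

// Hypothetical result shape; only the XML tag names come from the example.
type TaskResult = {
  toolUseId: string;
  status: "completed" | "failed" | "killed";
  summary: string;
  result: string;
  usage: TaskUsage;
};

// Escapes characters that would otherwise break the tag structure.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function formatTaskNotification(taskId: string, r: TaskResult): string {
  return [
    "<task_notification>",
    `<task_id>${escapeXml(taskId)}</task_id>`,
    `<tool_use_id>${escapeXml(r.toolUseId)}</tool_use_id>`,
    `<status>${r.status}</status>`,
    `<summary>${escapeXml(r.summary)}</summary>`,
    `<result>\n${escapeXml(r.result)}\n</result>`,
    "<usage>",
    `<total_tokens>${r.usage.totalTokens}</total_tokens>`,
    `<tool_uses>${r.usage.toolUses}</tool_uses>`,
    `<duration_ms>${r.usage.durationMs}</duration_ms>`,
    "</usage>",
    "</task_notification>",
  ].join("\n");
}
```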

Local Agent Tasks: The Workhorse

The most common task type is local_agent — a background sub-query running the same model with its own context window.

State shape

type LocalAgentTaskState = TaskStateBase & {
  agentId: string
  prompt: string
  agentType: string // 'general-purpose', 'code-reviewer', etc.
  model?: string
  isBackgrounded: boolean // false = foreground, true = background
  pendingMessages: string[] // Queued during tool rounds
  progress?: {
    toolUseCount: number
    tokenCount: number
    recentActivities: ToolActivity[] // Last 5
    summary?: string
  }
}

Progress tracking

As the background agent works, the harness tracks its activity:

type ToolActivity = {
  toolName: string
  input: Record<string, unknown>
  activityDescription?: string // "Reading src/auth/jwt.ts"
  isSearch?: boolean
  isRead?: boolean
}

The last 5 activities are kept in a FIFO queue, giving the UI a real-time view of what the agent is doing. A separate background summarization pass periodically compresses the full activity history into a one-line summary.
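A bounded FIFO of this kind takes only a few lines. The sketch below uses a trimmed version of ToolActivity, and pushActivity is a hypothetical helper name; it returns a new array rather than mutating, which suits state that is updated immutably.

```typescript
type ToolActivity = {
  toolName: string;
  input: Record<string, unknown>;
  activityDescription?: string;
};

const MAX_RECENT_ACTIVITIES = 5;

// Returns a new array holding at most the `max` newest activities, oldest first.
function pushActivity(
  recent: readonly ToolActivity[],
  next: ToolActivity,
  max: number = MAX_RECENT_ACTIVITIES,
): ToolActivity[] {
  if (max <= 0) return [];
  return [...recent, next].slice(-max);
}
```

After the sixth push the oldest entry falls off the front, so the UI always renders a fixed-size window.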

Backgrounding the main conversation

A unique variant: the user can background the main conversation itself. Pressing Ctrl+B twice converts the current query into a LocalMainSessionTask:

function registerMainSessionTask(messages, state) {
  return registerTask({
    type: 'local_agent',
    agentType: 'main-session',
    prompt: extractCurrentPrompt(messages),
    isBackgrounded: true,
    // ... links output to existing transcript
  })
}

The conversation continues executing in the background while the user gets their terminal back. When complete, the task notification appears and the user can resume.

Coordinator Mode: Multi-Agent Orchestration

For complex tasks, a single agent is not enough. Coordinator mode enables a lead agent to spawn, direct, and synthesize results from multiple worker agents.

The coordinator receives special system prompt instructions:

Task workflow phases:
1. Research — Parallel research agents explore different aspects
2. Synthesis — Combine research into a coherent plan
3. Implementation — Serial workers execute plan steps
4. Verification — Verify results match intent
Concurrency guidelines:
- Research: parallel (no file conflicts)
- Implementation: serial (prevents write races)
- Verification: after all implementation complete
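One way to picture these guidelines is as a planner that groups steps into batches, where steps within a batch may run concurrently and batches run in order. The sketch below is hypothetical (planBatches and the Step shape are not from the source), and it assumes synthesis and verification steps run one at a time, which the guidelines do not specify.

```typescript
type Phase = "research" | "synthesis" | "implementation" | "verification";
type Step = { phase: Phase; description: string };

const PHASE_ORDER: readonly Phase[] = ["research", "synthesis", "implementation", "verification"];

// Hypothetical planner: steps inside one batch may run concurrently;
// batches run strictly one after another.
function planBatches(steps: readonly Step[]): Step[][] {
  const batches: Step[][] = [];
  for (const phase of PHASE_ORDER) {
    const inPhase = steps.filter(s => s.phase === phase);
    if (inPhase.length === 0) continue;
    if (phase === "research") {
      batches.push(inPhase); // parallel: read-only work, no file conflicts
    } else {
      for (const step of inPhase) batches.push([step]); // serial: one step per batch
    }
  }
  return batches;
}
```

Because verification is the last phase in the order, its batches can only start after every implementation batch has finished.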

Worker tool pools

Workers get a restricted tool set. The coordinator knows exactly what each worker can do:

const workerTools = isSimpleMode
  ? [BashTool, FileReadTool, FileEditTool] // Minimal
  : allTools.filter(t => !isInternalTool(t)) // Full minus internal

The coordinator’s system prompt lists available tools so it can make informed delegation decisions: “This worker can read and edit files but cannot run web searches — delegate the documentation lookup to a different worker.”

Communication protocol

Coordinator and workers communicate through two mechanisms:

  1. AgentTool — The coordinator spawns new workers with a description and prompt
  2. SendMessageTool — The coordinator continues an existing worker with additional instructions

Workers cannot communicate with each other directly — all coordination flows through the lead agent. This prevents message races and ensures the coordinator maintains a complete picture of the work.
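The hub-and-spoke shape can be sketched with a coordinator that is the only holder of worker references. Everything here is illustrative: the Coordinator class, WorkerHandle type, and ID format are hypothetical, and the two methods only mirror the roles the text assigns to AgentTool and SendMessageTool.

```typescript
type WorkerHandle = { id: string; description: string; inbox: string[] };

// Hypothetical coordinator: workers never receive references to each other,
// so every message necessarily passes through this object.
class Coordinator {
  private workers = new Map<string, WorkerHandle>();
  private nextId = 1;
  readonly log: string[] = [];

  // Mirrors AgentTool: start a new worker with a description and prompt.
  spawn(description: string, prompt: string): string {
    const id = `worker-${this.nextId++}`;
    this.workers.set(id, { id, description, inbox: [prompt] });
    this.log.push(`spawn ${id}`);
    return id;
  }

  // Mirrors SendMessageTool: continue an existing worker with more instructions.
  sendMessage(workerId: string, text: string): boolean {
    const worker = this.workers.get(workerId);
    if (!worker) return false;
    worker.inbox.push(text);
    this.log.push(`message ${workerId}`);
    return true;
  }

  inboxOf(workerId: string): readonly string[] {
    return this.workers.get(workerId)?.inbox ?? [];
  }
}
```

The log gives the coordinator a complete, ordered record of who was told what, which is the property the single-hub design is meant to preserve.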

Remote Agent Tasks

Some operations need more resources than a local machine provides. Remote agents execute on cloud infrastructure:

type RemoteAgentTaskState = TaskStateBase & {
  remoteTaskType: 'remote-agent' | 'ultraplan' | 'ultrareview' | 'background-pr'
  sessionId: string
  command: string
  title: string
  todoList: TodoList
  log: SDKMessage[]
  isLongRunning?: boolean
}

Remote tasks poll for status updates rather than streaming:

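async function pollRemoteSessionEvents(taskId, getAppState, pollInterval) {
  let lastEventId // Cursor so each fetch returns only new events
  while (true) {
    // Re-read the task each round so status changes are observed
    const task = getAppState().tasks[taskId]
    if (!task || isTerminal(task.status)) break
    const events = await fetchRemoteEvents(task.sessionId, lastEventId)
    for (const event of events) {
      updateTaskFromEvent(taskId, event)
      lastEventId = event.id // Assumes events carry an id field
    }
    await sleep(pollInterval)
  }
}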

Completion checkers

Remote tasks can register custom completion logic:

registerCompletionChecker(taskId, (task, event) => {
  // Custom logic: is this review complete?
  if (event.type === 'review_complete' && event.bugs.length === 0) {
    return { status: 'completed', result: 'No issues found' }
  }
  return null // Not complete yet
})
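Behind a call like this there has to be a registry and a dispatch step. The sketch below shows one plausible shape: registerCompletionChecker matches the call above, while checkCompletion, the Map-based registry, and the event and task types are assumptions.

```typescript
type RemoteEvent = { type: string; bugs?: unknown[] };
type RemoteTask = { id: string; status: string };
type Completion = { status: "completed" | "failed"; result: string };
type CompletionChecker = (task: RemoteTask, event: RemoteEvent) => Completion | null;

const completionCheckers = new Map<string, CompletionChecker>();

function registerCompletionChecker(taskId: string, checker: CompletionChecker): void {
  completionCheckers.set(taskId, checker);
}

// Hypothetical dispatch step, called once per incoming event. Returns the
// checker's verdict, or null when no checker is registered or it is undecided.
function checkCompletion(task: RemoteTask, event: RemoteEvent): Completion | null {
  const checker = completionCheckers.get(task.id);
  return checker ? checker(task, event) : null;
}

// The review checker from the text, registered for a sample task ID.
registerCompletionChecker("r4c2m8q1", (_task, event) => {
  if (event.type === "review_complete" && event.bugs?.length === 0) {
    return { status: "completed", result: "No issues found" };
  }
  return null; // not complete yet
});
```

Returning null for "undecided" lets the polling loop keep going without the checker needing to know anything about the loop.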

Dream Tasks: Background Memory Consolidation

The most unusual task type: dreams. These are automatic background processes that consolidate session memories:

type DreamTaskState = TaskStateBase & {
  type: 'dream'
  phase: 'starting' | 'updating'
  sessionsReviewing: number
  filesTouched: string[]
  turns: DreamTurn[] // Max 30
  priorMtime: number // For lock rewinding on kill
}

Dream tasks review past conversation sessions and extract patterns, corrections, and insights into the persistent memory system. They are UI-only — the model never sees dream notifications, preventing self-referential loops.
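Two details of dream tasks lend themselves to a short sketch: keeping their notifications away from the model, and capping the turn list at 30. The helper names below are hypothetical, the DreamTurn shape is invented for illustration, and dropping the oldest turn at the cap is an assumption; the source states only the maximum.

```typescript
type TaskType =
  | "local_bash" | "local_agent" | "remote_agent" | "in_process_teammate"
  | "local_workflow" | "monitor_mcp" | "dream";

// Task types whose completion is shown in the UI but never sent to the model.
const UI_ONLY_TASK_TYPES: ReadonlySet<TaskType> = new Set<TaskType>(["dream"]);

function shouldNotifyModel(type: TaskType): boolean {
  return !UI_ONLY_TASK_TYPES.has(type);
}

const MAX_DREAM_TURNS = 30;
type DreamTurn = { text: string }; // hypothetical shape

// Assumption: once the cap is reached, the oldest turn is dropped.
function appendDreamTurn(turns: readonly DreamTurn[], turn: DreamTurn): DreamTurn[] {
  return [...turns, turn].slice(-MAX_DREAM_TURNS);
}
```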

Eviction and Cleanup

Terminal tasks do not persist forever. The eviction system cleans up:

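function evictTerminalTask(taskId, setAppState) {
  setAppState(prev => {
    const task = prev.tasks[taskId]
    // Guard conditions
    if (!task) return prev // Already evicted
    if (!isTerminal(task.status)) return prev // Still running
    if (!task.notified) return prev // Notification pending
    if (task.retain) return prev // UI holding reference
    if (withinGracePeriod(task, 30_000)) return prev // 30s grace for display
    // Evict without mutating the previous state object
    const { [taskId]: _evicted, ...remaining } = prev.tasks
    return { ...prev, tasks: remaining }
  })
}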

The grace period (30 seconds for local agents, 3 seconds for killed tasks) ensures the UI has time to display the completion before the task disappears.
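The per-status grace period can be folded into the eviction check. This sketch assumes killed tasks use the 3-second window and every other terminal task uses the 30-second one; the endTime field and the isEvictable helper are hypothetical names.

```typescript
type TaskStatus = "pending" | "running" | "completed" | "failed" | "killed";

type EvictableTask = {
  status: TaskStatus;
  notified: boolean;
  retain: boolean;
  endTime: number; // hypothetical: ms timestamp when the task reached a terminal state
};

// Assumption: killed tasks disappear quickly, other terminal tasks linger for display.
function gracePeriodMs(task: EvictableTask): number {
  return task.status === "killed" ? 3_000 : 30_000;
}

// Mirrors the guard conditions above: terminal, notified, not retained, past grace.
function isEvictable(task: EvictableTask, now: number): boolean {
  const terminal =
    task.status === "completed" || task.status === "failed" || task.status === "killed";
  return terminal && task.notified && !task.retain && now - task.endTime >= gracePeriodMs(task);
}
```

Passing `now` as a parameter keeps the check pure, so the same function serves both the polling loop and tests.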

Architectural Insights

Three patterns from the task system apply broadly:

1. Atomic notifications solve the “who tells whom” problem. When multiple code paths can detect the same state change, a check-and-set flag ensures exactly one notification. This is simpler and more reliable than distributed coordination protocols.

2. Disk-backed output with read pointers enables efficient incremental consumption. Instead of buffering unbounded output in memory, tasks write to disk and consumers read deltas. This handles arbitrarily long-running tasks without memory pressure.

3. Type-prefixed IDs provide instant classification. When you see a7k9d2x1 in a log, you know it is an agent without looking it up. This small design decision compounds in debugging sessions where you are scanning thousands of events.

Next: State, Cost, and the Production Surface, where we examine the state management, cost tracking, rate limiting, and UI rendering that make all of this usable.
