Core Concepts

Capstan serves two roles: (1) an AI agent framework for building intelligent agents with durable execution, self-evolution, and production robustness, and (2) a full-stack web framework for building typed HTTP/MCP/A2A/OpenAPI applications. Both share the same Bun-native runtime. This document covers the agent framework first (Part 1) because it is the primary use case, then the web framework (Part 2).

Part 1: Smart Agent

createSmartAgent

createSmartAgent is the central API. It takes a configuration object and returns a SmartAgent with two methods: run(goal) and resume(checkpoint, message).

Here is a production agent in under 30 lines:

import { createSmartAgent } from "@zauso-ai/capstan-ai";
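// openaiProvider is assumed to be exported from the same package as anthropicProvider.
import { anthropicProvider, openaiProvider } from "@zauso-ai/capstan-agent";

// readFile, writeFile, runCommand, searchCode, the two skills, and myEvolutionStore
// are assumed to be defined elsewhere in your project.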

const agent = createSmartAgent({
  llm: anthropicProvider({
    apiKey: process.env.ANTHROPIC_API_KEY!,
    model: "claude-sonnet-4-20250514",
  }),
  tools: [readFile, writeFile, runCommand, searchCode],
  skills: [debuggingSkill, refactoringSkill],
  evolution: {
    store: myEvolutionStore,
    capture: "every-run",
    distillation: "post-run",
  },
  maxIterations: 200,
  contextWindowSize: 200_000,
  fallbackLlm: openaiProvider({ apiKey: process.env.OPENAI_API_KEY!, model: "gpt-4o" }),
  llmTimeout: { chatTimeoutMs: 120_000, streamIdleTimeoutMs: 90_000 },
});

const result = await agent.run("Fix the failing test in src/parser.test.ts");

console.log(result.status);      // "completed" | "max_iterations" | "fatal" | ...
console.log(result.iterations);  // how many loop iterations it took
console.log(result.toolCalls);   // full tool call trace

SmartAgentConfig Reference

Property           Type                        Required  Description
llm                LLMProvider                 Yes       Primary language model
tools              AgentTool[]                 Yes       Operations the agent can invoke
skills             AgentSkill[]                No        Strategic guidance the agent can activate
evolution          EvolutionConfig             No        Self-evolution: experience capture, distillation
memory             SmartAgentMemoryConfig      No        Scoped memory with pluggable backend
maxIterations      number                      No        Max loop iterations (default: 10)
contextWindowSize  number                      No        Context window size for compression decisions
fallbackLlm        LLMProvider                 No        Backup model when primary fails
tokenBudget        number | TokenBudgetConfig  No        Output token budget with nudge + force-complete
toolResultBudget   ToolResultBudgetConfig      No        Per-result and aggregate truncation limits
llmTimeout         LLMTimeoutConfig            No        Watchdog timeouts for chat and streaming
hooks              SmartAgentHooks             No        Lifecycle hooks (before/after tool calls, etc.)
compaction         Partial<CompactionConfig>   No        Compression tuning (snip, microcompact, autocompact)
stopHooks          StopHook[]                  No        Quality gates on final responses

The Agent Loop

runSmartLoop is the engine inside createSmartAgent. It runs in three phases: initialization, a main loop whose iterations each execute eight steps, and post-loop handling.

Phase 1: Initialization

Before the first iteration:

  1. Create engine state from config (tools, messages, counters)
  2. Build tool catalog (inline all tools, or defer large sets behind a discover_tool meta-tool)
  3. Inject activate_skill synthetic tool if skills are configured
  4. Retrieve relevant memories from memory store
  5. Compose system prompt (base prompt + tool descriptions + skill catalog + memories + strategies)
  6. Set initial messages: [system prompt, user goal]

Phase 2: Main Loop

Each iteration runs these steps:

  1. Compression Check -- if token usage exceeds 60% of the context window, run snip + microcompact; at 85%, run autocompact (LLM-driven summarization)
  2. Control Check -- operator can return "pause" or "cancel" via getControlState hook
  3. Model + Tool Execution -- call LLM, parse tool calls, validate arguments (JSON Schema + custom validate), execute tools with timeout and concurrency
  4. Error Handling -- on context limit error: autocompact recovery then reactive compact then fatal; on other error: try fallbackLlm
  5. Token Budget -- at nudge threshold inject wrap-up message; at 100% force-complete
  6. Result Processing -- error withholding (retry once), tool result budgeting, memory event hook
  7. Dynamic Context Enrichment -- every 5 iterations, query memory for fresh relevant context
  8. Continuation Decision -- run stop hooks; if rejected, inject feedback and continue (max 3 rejections); otherwise complete
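
The compression check in step 1 can be sketched as a small threshold function. This is an illustration only: chooseCompression and CompressionAction are hypothetical names, not Capstan APIs, and treating 60% as an exclusive bound and 85% as an inclusive one is an assumption.

```typescript
// Hypothetical helper: decides which compression pass step 1 would run.
type CompressionAction = "none" | "snip+microcompact" | "autocompact";

function chooseCompression(usedTokens: number, contextWindowSize: number): CompressionAction {
  const ratio = usedTokens / contextWindowSize;
  // Assumption: 85% is inclusive, 60% is exclusive ("exceed").
  if (ratio >= 0.85) return "autocompact";
  if (ratio > 0.6) return "snip+microcompact";
  return "none";
}

console.log(chooseCompression(100_000, 200_000)); // "none"
console.log(chooseCompression(130_000, 200_000)); // "snip+microcompact"
console.log(chooseCompression(180_000, 200_000)); // "autocompact"
```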

Phase 3: Post-Loop

If the loop exits due to maxIterations, the last assistant message becomes the result with status "max_iterations".

Tools

Tools are operations with defined inputs and outputs -- reading files, running commands, calling APIs. They are validated in two phases before execution: JSON Schema validation (parameters) and custom validation (validate).

import type { AgentTool } from "@zauso-ai/capstan-ai";

const readFile: AgentTool = {
  name: "read_file",
  description: "Read the contents of a file at the given path",
  parameters: {
    type: "object",
    properties: {
      path: { type: "string", description: "Absolute file path" },
      offset: { type: "integer", description: "Line to start reading from" },
      limit: { type: "integer", description: "Max lines to read" },
    },
    required: ["path"],
  },
  validate(args) {
    if ((args.path as string).includes(".."))
      return { valid: false, error: "Path traversal not allowed" };
    return { valid: true };
  },
  timeout: 10_000,
  isConcurrencySafe: true,
  failureMode: "soft",
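  async execute(args) {
    const content = await Bun.file(args.path as string).text();
    const allLines = content.split("\n");
    // Honor the optional offset/limit parameters declared in the schema
    // (offset is treated as a zero-based line index here).
    const start = (args.offset as number | undefined) ?? 0;
    const end = args.limit === undefined ? undefined : start + (args.limit as number);
    const selected = allLines.slice(start, end);
    return { content: selected.join("\n"), lines: selected.length };
  },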
};

Tool result budgeting: large results are truncated and optionally persisted to disk. The agent gets a read_persisted_result tool automatically to retrieve the full data.
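
To make the truncation idea concrete, here is a minimal sketch of a per-result budget. The names budgetResult and BudgetedResult and the character-based limit are assumptions for illustration; the actual ToolResultBudgetConfig fields are not shown in this document, and persistence to disk is left out.

```typescript
// Hypothetical shape: what the model would see after budgeting.
interface BudgetedResult {
  preview: string;      // the part of the result kept in context
  truncated: boolean;   // whether anything was cut
  omittedChars: number; // how much was cut
}

function budgetResult(full: string, maxChars: number): BudgetedResult {
  if (full.length <= maxChars) {
    return { preview: full, truncated: false, omittedChars: 0 };
  }
  return {
    preview: full.slice(0, maxChars),
    truncated: true,
    omittedChars: full.length - maxChars,
  };
}

console.log(budgetResult("abcdef", 4)); // { preview: "abcd", truncated: true, omittedChars: 2 }
```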

Concurrent execution: tools marked isConcurrencySafe: true can execute in parallel. Configure max parallelism with streaming: { maxConcurrency: 4 }.
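
One way to picture this is as batch planning: consecutive concurrency-safe calls are grouped up to the parallelism limit, and every other call runs alone. The sketch below is an assumption about scheduling, not Capstan's actual scheduler; planBatches and PlannedCall are hypothetical names.

```typescript
// Hypothetical input: a tool call plus the isConcurrencySafe flag of its tool.
interface PlannedCall {
  name: string;
  isConcurrencySafe: boolean;
}

function planBatches(calls: PlannedCall[], maxConcurrency: number): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  const flush = () => {
    if (current.length > 0) {
      batches.push(current);
      current = [];
    }
  };
  for (const call of calls) {
    if (!call.isConcurrencySafe) {
      flush();                    // finish the pending parallel batch first
      batches.push([call.name]);  // unsafe calls run on their own
      continue;
    }
    current.push(call.name);
    if (current.length === maxConcurrency) flush();
  }
  flush();
  return batches;
}

const plan = planBatches(
  [
    { name: "read_file", isConcurrencySafe: true },
    { name: "search_code", isConcurrencySafe: true },
    { name: "write_file", isConcurrencySafe: false },
    { name: "read_file", isConcurrencySafe: true },
  ],
  4,
);
console.log(plan); // [["read_file", "search_code"], ["write_file"], ["read_file"]]
```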

Skills

Skills are strategies, not operations. They provide high-level guidance for how to approach a class of problems. When activated, a skill's prompt is injected into the conversation.

Aspect        Tool                                 Skill
What it is    An operation with I/O                A strategy with guidance text
Invocation    Model calls it with arguments        Model activates it by name
Result        Concrete data (file contents, etc.)  Injected guidance prompt
Side effects  Yes (reads/writes/network)           No (read-only prompt injection)
Source        Developer-defined                    Developer-defined or auto-evolved

import { defineSkill } from "@zauso-ai/capstan-ai";

const debuggingSkill = defineSkill({
  name: "debugging",
  description: "Systematic debugging methodology",
  trigger: "When encountering bugs, test failures, or unexpected behavior",
  prompt: `## Debugging Strategy
1. REPRODUCE: Confirm the failure by running the exact failing test.
2. ISOLATE: Narrow down to the smallest reproducing case.
3. HYPOTHESIZE: Form a specific hypothesis about the root cause.
4. VERIFY: Test the hypothesis with targeted reads/searches.
5. FIX: Apply the minimal fix that addresses the root cause.
6. CONFIRM: Re-run the original failing test to verify.`,
  tools: ["read_file", "run_command", "search_code"],
});

At runtime: skills are listed in the system prompt, the runtime injects a synthetic activate_skill tool, and the agent calls it when needed.

Memory

The memory system provides scoped, searchable memory that persists across agent runs.

const agent = createSmartAgent({
  // ...
  memory: {
    store: new BuiltinMemoryBackend(),
    scope: { type: "project", id: "my-app" },
    readScopes: [{ type: "global", id: "shared" }],
    maxMemoryTokens: 4000,
    saveSessionSummary: true,
  },
});

Features: initial retrieval before the first iteration, staleness annotations (age-based freshness notes), dynamic enrichment every 5 iterations, session summary auto-save. Backends: BuiltinMemoryBackend (in-memory), SqliteMemoryBackend (persistent), or custom.

Self-Evolution

Self-evolution enables agents to learn from their runs and improve over time:

Experience (run trajectory) --> Strategy (distilled pattern) --> Skill (promoted guidance)

  1. Run 1-5: Raw experience capture. Each run records goal, outcome, tool call trajectory, iterations, duration.
  2. Run 3+: Strategy distillation. The LLM-driven distiller analyzes trajectories and extracts generalizable strategies.
  3. Run 10+: Strategy refinement. Consolidator merges overlapping strategies, resolves contradictions. Utility scores: +0.1 success, -0.05 failure.
  4. Run 50+: Skill promotion. Strategies reaching utility >= 0.7 after >= 5 applications are auto-promoted to reusable skills.

evolution: {
  store: new SqliteEvolutionStore("./agent-evolution.db"),
  capture: "every-run",          // "every-run" | "on-failure" | "on-success" | custom
  distillation: "post-run",      // "post-run" | "manual"
  pruning: { maxStrategies: 50, minUtility: 0.2 },
  skillPromotion: { minUtility: 0.7, minApplications: 5 },
}
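
The utility scoring and promotion rule described above can be sketched as two small functions. StrategyStats, recordOutcome, and isPromotable are hypothetical names, and clamping utility to the range 0 to 1 is an assumption; only the +0.1 / -0.05 deltas and the 0.7 / 5 thresholds come from the text.

```typescript
// Hypothetical bookkeeping for one distilled strategy.
interface StrategyStats {
  utility: number;
  applications: number;
}

function recordOutcome(stats: StrategyStats, success: boolean): StrategyStats {
  const delta = success ? 0.1 : -0.05;
  // Assumption: utility is clamped to [0, 1].
  const utility = Math.min(1, Math.max(0, stats.utility + delta));
  return { utility, applications: stats.applications + 1 };
}

function isPromotable(stats: StrategyStats, minUtility = 0.7, minApplications = 5): boolean {
  return stats.applications >= minApplications && stats.utility >= minUtility;
}

// Five successful applications starting from a hypothetical initial utility of 0.5.
let stats: StrategyStats = { utility: 0.5, applications: 0 };
for (let i = 0; i < 5; i++) stats = recordOutcome(stats, true);
console.log(isPromotable(stats)); // true
```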

Production Robustness

The agent loop includes nine robustness mechanisms:

Mechanism              Description
Model fallback         When primary LLM fails, strip thinking blocks and retry with fallbackLlm
Reactive compression   3-phase: autocompact -> reactive compact -> fatal
Token budget           Nudge at 80% + force-complete at 100%
LLM watchdog           Chat timeout (120s), stream idle timeout (90s), stall warning (30s)
Tool timeout           Per-tool configurable timeout via Promise.race
Error withholding      Retry failed tools once before exposing error to LLM
Message normalization  Merge adjacent same-role messages, filter empties
Input validation       Two-layer: JSON Schema + custom validate hook
Abort handling         Blocked tool calls produce synthetic results so the LLM can adjust
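
As an illustration of one of these mechanisms, message normalization might look roughly like the sketch below. ChatMessage and normalizeMessages are hypothetical names, and joining merged content with a blank line is an assumption.

```typescript
// Hypothetical message shape; the real one likely carries more fields.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function normalizeMessages(messages: ChatMessage[]): ChatMessage[] {
  const out: ChatMessage[] = [];
  for (const msg of messages) {
    if (msg.content.trim() === "") continue; // filter empties
    const last = out[out.length - 1];
    if (last && last.role === msg.role) {
      last.content += "\n\n" + msg.content;  // merge adjacent same-role messages
    } else {
      out.push({ ...msg });                  // copy so the input is not mutated
    }
  }
  return out;
}

const normalized = normalizeMessages([
  { role: "user", content: "a" },
  { role: "user", content: "b" },
  { role: "assistant", content: "" },
  { role: "assistant", content: "c" },
]);
console.log(normalized.length); // 2
```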

Lifecycle Hooks

Hooks provide fine-grained control over agent execution. All hooks are optional and non-fatal.

Hook             When                                        Purpose
beforeToolCall   Before each tool execution                  Gate: return { allowed: false } to block
afterToolCall    After each tool execution                   Observe results. Status is "success" or "error"
onCheckpoint     At initialization, tool_result, completion  Save or modify checkpoints
getControlState  Before LLM, before/after tools              Operator control: pause or cancel
onRunComplete    Once at end of run                          Final notification/logging
afterIteration   After each iteration                        Progress monitoring
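
A gate hook could be written along these lines. Only the { allowed: false } return value comes from the table above; the argument shape (ToolCallInfo), the reason field, and the synchronous signature are assumptions made for this sketch.

```typescript
// Hypothetical shapes for illustration; check the SmartAgentHooks type for the real ones.
interface ToolCallInfo {
  tool: string;
  args: Record<string, unknown>;
}
interface GateDecision {
  allowed: boolean;
  reason?: string;
}

const hooks = {
  // Block obviously destructive shell commands before they run.
  beforeToolCall(call: ToolCallInfo): GateDecision {
    const command = String(call.args["command"] ?? "");
    if (call.tool === "run_command" && command.includes("rm -rf")) {
      return { allowed: false, reason: "Destructive command blocked" };
    }
    return { allowed: true };
  },
};

console.log(hooks.beforeToolCall({ tool: "run_command", args: { command: "rm -rf /tmp/x" } }).allowed); // false
console.log(hooks.beforeToolCall({ tool: "read_file", args: { path: "/tmp/x" } }).allowed);             // true
```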

Checkpoints and Resume

Every agent run produces checkpoints that can be used to resume interrupted runs:

const result = await agent.run("Deploy the new feature");

if (result.status === "paused") {
  const resumed = await agent.resume(result.checkpoint!, "Approved. Continue deployment.");
}

if (result.status === "approval_required") {
  const { tool, args, reason } = result.pendingApproval!;
  // After human approval:
  const resumed = await agent.resume(result.checkpoint!, "Approved. Proceed.");
}

Stop Hooks (Guardrails)

Stop hooks are quality gates evaluated when the model produces a final response. If any hook fails, the response is rejected with feedback and the agent continues. After 3 consecutive rejections, the agent is force-completed.
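
The rejection logic can be sketched as a pure decision function. The StopHookFn and StopHookResult shapes and the decideContinuation helper are hypothetical; the real StopHook type is not shown in this document.

```typescript
// Hypothetical hook shape: inspect the final response, pass or reject with feedback.
interface StopHookResult {
  pass: boolean;
  feedback?: string;
}
type StopHookFn = (response: string) => StopHookResult;

type Decision =
  | { action: "complete" }
  | { action: "continue"; feedback: string }
  | { action: "force_complete" };

const MAX_REJECTIONS = 3;

function decideContinuation(response: string, hooks: StopHookFn[], priorRejections: number): Decision {
  const failed = hooks.map((hook) => hook(response)).find((r) => !r.pass);
  if (!failed) return { action: "complete" };
  // The third consecutive rejection ends the run instead of looping again.
  if (priorRejections + 1 >= MAX_REJECTIONS) return { action: "force_complete" };
  return { action: "continue", feedback: failed.feedback ?? "Response rejected by stop hook" };
}

// Example gate: reject answers that still contain a TODO marker.
const noTodo: StopHookFn = (response) =>
  response.includes("TODO")
    ? { pass: false, feedback: "Resolve remaining TODO items before finishing." }
    : { pass: true };

console.log(decideContinuation("All tests pass.", [noTodo], 0).action); // "complete"
console.log(decideContinuation("TODO: fix lint", [noTodo], 0).action);  // "continue"
console.log(decideContinuation("TODO: fix lint", [noTodo], 2).action);  // "force_complete"
```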

Part 2: Full-Stack Web

defineAPI()

defineAPI() is the central building block for web endpoints. A single call defines a typed handler projected to HTTP, MCP, A2A, and OpenAPI.

import { defineAPI } from "@zauso-ai/capstan-core";
import { z } from "zod";

export const POST = defineAPI({
  input: z.object({
    title: z.string().min(1).max(200),
    priority: z.enum(["low", "medium", "high"]).default("medium"),
  }),
  output: z.object({
    id: z.string(),
    title: z.string(),
    status: z.string(),
  }),
  description: "Create a new ticket",
  capability: "write",
  resource: "ticket",
  policy: "requireAuth",
  async handler({ input, ctx }) {
    return { id: crypto.randomUUID(), title: input.title, status: "open" };
  },
});

Multi-Protocol Projection

One defineAPI() call simultaneously creates endpoints across four protocols:

defineAPI() --> CapabilityRegistry
                  |-- HTTP JSON API (Hono)
                  |-- MCP Tools (@modelcontextprotocol/sdk)
                  |-- A2A Skills (Google Agent-to-Agent)
                  +-- OpenAPI 3.1 Spec

  • HTTP -- Input from query params (GET) or JSON body (POST/PUT/PATCH/DELETE)
  • MCP -- Each route becomes a tool. GET /tickets becomes get_tickets
  • A2A -- Each route becomes a skill via JSON-RPC tasks/send
  • OpenAPI -- Each route becomes an operation with full schema generation
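
The MCP naming rule can be illustrated with a small helper. Only the GET /tickets to get_tickets example is documented above; toToolName is a hypothetical function, and how Capstan names routes with dynamic segments is not covered here, so the character replacement below is a guess.

```typescript
// Hypothetical derivation of an MCP tool name from an HTTP method and path.
function toToolName(method: string, path: string): string {
  const segments = path
    .split("/")
    .filter(Boolean)
    // Assumption: non-alphanumeric characters collapse to underscores.
    .map((segment) => segment.replace(/[^a-zA-Z0-9]+/g, "_").replace(/^_+|_+$/g, ""))
    .filter((segment) => segment.length > 0);
  return [method.toLowerCase(), ...segments].join("_");
}

console.log(toToolName("GET", "/tickets")); // "get_tickets"
```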

Auto-Generated Endpoints

Endpoint                       Protocol  Description
GET /.well-known/capstan.json  Capstan   Agent manifest with all capabilities
GET /.well-known/agent.json    A2A       Agent card with skills list
POST /.well-known/a2a          A2A       JSON-RPC task handler
POST /.well-known/mcp          MCP       Streamable HTTP MCP endpoint
GET /openapi.json              OpenAPI   OpenAPI 3.1 specification
GET /capstan/approvals         Capstan   Approval workflow management

File-Based Routing

Routes live in app/routes/. The router scans the directory tree and maps files to URL patterns.

File Pattern    Route Type  Description
*.api.ts        API         API handler (exports HTTP methods)
*.page.tsx      Page        React page component (SSR)
_layout.tsx     Layout      Wraps nested routes via <Outlet>
_middleware.ts  Middleware  Runs before handlers in scope
_loading.tsx    Loading     Suspense fallback for pages
_error.tsx      Error       Error boundary for pages

Dynamic segments use [param], catch-all uses [...param], route groups use (name) (transparent in URL).
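
A rough sketch of how these segment conventions could translate file paths (relative to app/routes/) into URL patterns follows. fileToRoute is a hypothetical helper, and the output syntax (:param for dynamic segments, * for catch-all) is an assumption rather than the router's documented format.

```typescript
// Hypothetical mapping from a route file path to a URL pattern.
function fileToRoute(file: string): string {
  const withoutExt = file.replace(/\.(api\.ts|page\.tsx)$/, "");
  const parts = withoutExt
    .split("/")
    .filter((segment) => segment !== "" && !/^\(.+\)$/.test(segment)) // route groups are transparent
    .map((segment) => {
      if (/^\[\.\.\.(.+)\]$/.test(segment)) return "*";               // catch-all: [...param]
      const dynamic = segment.match(/^\[(.+)\]$/);                    // dynamic: [param]
      return dynamic ? `:${dynamic[1]}` : segment;
    });
  return "/" + parts.join("/");
}

console.log(fileToRoute("tickets/[id].api.ts"));     // "/tickets/:id"
console.log(fileToRoute("(admin)/users.api.ts"));    // "/users"
console.log(fileToRoute("docs/[...slug].page.tsx")); // "/docs/*"
```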

definePolicy()

Policies define permission rules evaluated before route handlers.

import { definePolicy } from "@zauso-ai/capstan-core";

export const requireAuth = definePolicy({
  key: "requireAuth",
  title: "Require Authentication",
  effect: "deny",
  async check({ ctx }) {
    if (!ctx.auth.isAuthenticated) {
      return { effect: "deny", reason: "Authentication required" };
    }
    return { effect: "allow" };
  },
});

Policy Effects

Effect   Behavior
allow    Request proceeds normally
deny     Request is rejected with 403 Forbidden
approve  Request is held for human approval (returns 202 with approval ID)
redact   Request proceeds but response data may be filtered

When multiple policies apply, all are evaluated and the most restrictive effect wins: allow < redact < approve < deny.
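
The precedence rule can be expressed as a simple reduction over the evaluated effects. combineEffects and the RESTRICTIVENESS ranking are hypothetical names used for illustration, and returning "allow" for an empty list is an assumption.

```typescript
type PolicyEffect = "allow" | "redact" | "approve" | "deny";

// Ranking taken from the ordering above: allow < redact < approve < deny.
const RESTRICTIVENESS: Record<PolicyEffect, number> = {
  allow: 0,
  redact: 1,
  approve: 2,
  deny: 3,
};

function combineEffects(effects: PolicyEffect[]): PolicyEffect {
  return effects.reduce<PolicyEffect>(
    (worst, effect) => (RESTRICTIVENESS[effect] > RESTRICTIVENESS[worst] ? effect : worst),
    "allow", // assumption: no applicable policies means the request proceeds
  );
}

console.log(combineEffects(["allow", "approve", "redact"])); // "approve"
console.log(combineEffects(["redact", "deny", "approve"]));  // "deny"
```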

defineModel() (Database)

Capstan uses Drizzle ORM for data modeling. defineModel() creates typed table definitions with auto-generated CRUD route helpers.

import { defineModel } from "@zauso-ai/capstan-db";
import { text } from "drizzle-orm/sqlite-core";

export const ticket = defineModel("ticket", {
  title: text("title").notNull(),
  priority: text("priority").default("medium"),
  status: text("status").default("open"),
});

Features: migrations, vector search, and generated CRUD endpoints that integrate with defineAPI() and the multi-protocol registry.

Verification Loop

capstan verify --json runs an 8-step cascade against your application:

Step       Checks
structure  Required files exist
config     Config file loads and has a valid export
routes     API files export handlers, write endpoints have policies
models     Model definitions valid
typecheck  tsc --noEmit passes
contracts  Models match routes, policy references valid
manifest   Agent manifest matches live routes
protocols  HTTP, MCP, A2A, OpenAPI schema consistency

Output includes repairChecklist with fixCategory and autoFixable flags, enabling an AI self-repair loop.
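
A self-repair loop would typically start by separating the items it can fix on its own from those that need a human. The sketch below assumes a report shape: only repairChecklist, fixCategory, and autoFixable are named in this document, while the message field, the sample values, and splitRepairs are hypothetical.

```typescript
// Hypothetical subset of the JSON emitted by `capstan verify --json`.
interface RepairItem {
  fixCategory: string;
  autoFixable: boolean;
  message: string; // assumed field, for illustration only
}
interface VerifyReport {
  repairChecklist: RepairItem[];
}

function splitRepairs(report: VerifyReport): { auto: RepairItem[]; manual: RepairItem[] } {
  return {
    auto: report.repairChecklist.filter((item) => item.autoFixable),
    manual: report.repairChecklist.filter((item) => !item.autoFixable),
  };
}

// Made-up sample data, not real CLI output.
const sampleReport: VerifyReport = {
  repairChecklist: [
    { fixCategory: "example-category-a", autoFixable: true, message: "sample auto-fixable item" },
    { fixCategory: "example-category-b", autoFixable: false, message: "sample manual item" },
  ],
};

const { auto, manual } = splitRepairs(sampleReport);
console.log(auto.length, manual.length); // 1 1
```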

AI in Web Handlers

The AI toolkit integrates with web handlers via the request context:

export const POST = defineAPI({
  // ...
  async handler({ input, ctx }) {
    const analysis = await ctx.think(input.message, {
      schema: z.object({ intent: z.string(), confidence: z.number() }),
    });

    await ctx.remember(`User asked about: ${analysis.intent}`);
    const history = await ctx.recall(input.message);

    return { analysis, relatedHistory: history };
  },
});

think() returns structured data via Zod schema parsing. generate() returns raw text. Both have streaming variants.