Core Concepts

Capstan serves two roles: (1) an AI agent framework for building intelligent agents with durable execution, self-evolution, and production robustness, and (2) a full-stack web framework for building typed HTTP/MCP/A2A/OpenAPI applications. Both share the same Bun-native runtime. This document covers the agent framework first (Part 1) because it is the primary use case, then the web framework (Part 2).

Part 1: Smart Agent

createSmartAgent

createSmartAgent is the central API. It takes a configuration object and returns a SmartAgent with two methods: run(goal) and resume(checkpoint, message).

Here is a production agent in 30 lines:

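import { createSmartAgent } from "@zauso-ai/capstan-ai";
// openaiProvider is assumed to be exported alongside anthropicProvider.
import { anthropicProvider, openaiProvider } from "@zauso-ai/capstan-agent";

// readFile, writeFile, runCommand, searchCode, the two skills, and
// myEvolutionStore are assumed to be defined elsewhere in your project.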
const agent = createSmartAgent({
  llm: anthropicProvider({
    apiKey: process.env.ANTHROPIC_API_KEY!,
    model: "claude-sonnet-4-20250514",
  }),
  tools: [readFile, writeFile, runCommand, searchCode],
  skills: [debuggingSkill, refactoringSkill],
  evolution: {
    store: myEvolutionStore,
    capture: "every-run",
    distillation: "post-run",
  },
  maxIterations: 200,
  contextWindowSize: 200_000,
  fallbackLlm: openaiProvider({ apiKey: process.env.OPENAI_API_KEY!, model: "gpt-4o" }),
  llmTimeout: { chatTimeoutMs: 120_000, streamIdleTimeoutMs: 90_000 },
});

const result = await agent.run("Fix the failing test in src/parser.test.ts");

console.log(result.status);      // "completed" | "max_iterations" | "fatal" | ...
console.log(result.iterations);  // how many loop iterations it took
console.log(result.toolCalls);   // full tool call trace

SmartAgentConfig Reference

Property	Type	Required	Description
`llm`	`LLMProvider`	Yes	Primary language model
`tools`	`AgentTool[]`	Yes	Operations the agent can invoke
`skills`	`AgentSkill[]`	No	Strategic guidance the agent can activate
`evolution`	`EvolutionConfig`	No	Self-evolution: experience capture, distillation
`memory`	`SmartAgentMemoryConfig`	No	Scoped memory with pluggable backend
`maxIterations`	`number`	No	Max loop iterations (default: 10)
`contextWindowSize`	`number`	No	Context window size for compression decisions
`fallbackLlm`	`LLMProvider`	No	Backup model when primary fails
`tokenBudget`	`number | TokenBudgetConfig`	No	Output token budget with nudge + force-complete
`toolResultBudget`	`ToolResultBudgetConfig`	No	Per-result and aggregate truncation limits
`llmTimeout`	`LLMTimeoutConfig`	No	Watchdog timeouts for chat and streaming
`hooks`	`SmartAgentHooks`	No	Lifecycle hooks (before/after tool calls, etc.)
`compaction`	`Partial<CompactionConfig>`	No	Compression tuning (snip, microcompact, autocompact)
`stopHooks`	`StopHook[]`	No	Quality gates on final responses

The Agent Loop

runSmartLoop is the engine inside createSmartAgent. It runs in three phases: initialization, a main loop whose iterations each go through 8 steps, and post-loop handling.

Phase 1: Initialization

Before the first iteration:

Create engine state from config (tools, messages, counters)
Build tool catalog (inline all tools, or defer large sets behind a discover_tool meta-tool)
Inject activate_skill synthetic tool if skills are configured
Retrieve relevant memories from memory store
Compose system prompt (base prompt + tool descriptions + skill catalog + memories + strategies)
Set initial messages: [system prompt, user goal]

Phase 2: Main Loop

Each iteration runs these steps:

Compression Check -- if token usage exceeds 60% of the context window, run snip + microcompact; if it exceeds 85%, run autocompact (LLM-driven summarization)
Control Check -- operator can return "pause" or "cancel" via getControlState hook
Model + Tool Execution -- call LLM, parse tool calls, validate arguments (JSON Schema + custom validate), execute tools with timeout and concurrency
Error Handling -- on context limit error: autocompact recovery then reactive compact then fatal; on other error: try fallbackLlm
Token Budget -- at nudge threshold inject wrap-up message; at 100% force-complete
Result Processing -- error withholding (retry once), tool result budgeting, memory event hook
Dynamic Context Enrichment -- every 5 iterations, query memory for fresh relevant context
Continuation Decision -- run stop hooks; if rejected, inject feedback and continue (max 3 rejections); otherwise complete
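
The compression check in the first step can be sketched as a pure function. This is a minimal illustration, not Capstan's implementation: the name chooseCompaction and the CompactionAction type are hypothetical, and it assumes the percentages are measured against contextWindowSize and that the higher tier replaces the lower one.

```typescript
// Hypothetical names; only the 60% / 85% thresholds come from the step list above.
type CompactionAction = "none" | "snip+microcompact" | "autocompact";

function chooseCompaction(usedTokens: number, contextWindowSize: number): CompactionAction {
  const usage = usedTokens / contextWindowSize;
  // Check the higher tier first so heavy usage goes straight to summarization.
  if (usage > 0.85) return "autocompact";
  if (usage > 0.6) return "snip+microcompact";
  return "none";
}
```

With a 200,000-token window, 130,000 tokens of history would trigger the cheap snip + microcompact pass, while 180,000 would trigger autocompact.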

Phase 3: Post-Loop

If the loop exits due to maxIterations, the last assistant message becomes the result with status "max_iterations".

Tools

Tools are operations with defined inputs and outputs -- reading files, running commands, calling APIs. They are validated in two phases before execution: JSON Schema validation (parameters) and custom validation (validate).

import type { AgentTool } from "@zauso-ai/capstan-ai";

const readFile: AgentTool = {
  name: "read_file",
  description: "Read the contents of a file at the given path",
  parameters: {
    type: "object",
    properties: {
      path: { type: "string", description: "Absolute file path" },
      offset: { type: "integer", description: "Line to start reading from" },
      limit: { type: "integer", description: "Max lines to read" },
    },
    required: ["path"],
  },
  validate(args) {
    if ((args.path as string).includes(".."))
      return { valid: false, error: "Path traversal not allowed" };
    return { valid: true };
  },
  timeout: 10_000,
  isConcurrencySafe: true,
  failureMode: "soft",
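  async execute(args) {
    const content = await Bun.file(args.path as string).text();
    const allLines = content.split("\n");
    // Honor the optional offset/limit parameters declared in the schema.
    // Assumes offset is a 0-based line index.
    const offset = (args.offset as number | undefined) ?? 0;
    const limit = (args.limit as number | undefined) ?? allLines.length;
    const lines = allLines.slice(offset, offset + limit);
    return { content: lines.join("\n"), lines: lines.length };
  },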
};
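
To make the two validation phases concrete, here is a self-contained sketch of the ordering: a schema check first, then the tool's own validate function. The ToolSketch type, checkSchema, and validateToolCall are hypothetical names, and the schema check covers only required keys and three primitive types. It is a stand-in for real JSON Schema validation, not a description of what Capstan does internally.

```typescript
type ValidationResult = { valid: true } | { valid: false; error: string };

// Hypothetical, reduced view of a tool: just enough to show both phases.
interface ToolSketch {
  name: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: "string" | "integer" | "boolean" }>;
    required?: string[];
  };
  validate?: (args: Record<string, unknown>) => ValidationResult;
}

// Phase 1: a tiny subset of JSON Schema (required keys and primitive types).
function checkSchema(tool: ToolSketch, args: Record<string, unknown>): ValidationResult {
  for (const key of tool.parameters.required ?? []) {
    if (!(key in args)) return { valid: false, error: `Missing required argument: ${key}` };
  }
  for (const [key, value] of Object.entries(args)) {
    const spec = tool.parameters.properties[key];
    if (!spec) continue; // unknown keys are ignored in this sketch
    const ok =
      spec.type === "string" ? typeof value === "string"
      : spec.type === "integer" ? Number.isInteger(value)
      : typeof value === "boolean";
    if (!ok) return { valid: false, error: `Argument ${key} must be of type ${spec.type}` };
  }
  return { valid: true };
}

// Phase 2 runs only if phase 1 passed, so validate() can trust argument types.
function validateToolCall(tool: ToolSketch, args: Record<string, unknown>): ValidationResult {
  const schemaResult = checkSchema(tool, args);
  if (!schemaResult.valid) return schemaResult;
  return tool.validate ? tool.validate(args) : { valid: true };
}
```

Running the schema check first is what makes a cast such as args.path as string inside validate reasonable: by the time the custom hook runs, the argument is already known to be a string.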

Tool result budgeting: large results are truncated and optionally persisted to disk. The agent gets a read_persisted_result tool automatically to retrieve the full data.
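
The truncation half of result budgeting might look like the sketch below. It assumes a simple per-result character limit; applyResultBudget and the BudgetedResult shape are hypothetical, and persistence to disk is left out.

```typescript
interface BudgetedResult {
  text: string;
  truncated: boolean;
  omittedChars: number;
}

// Keep the head of an oversized result and tell the model how much was cut.
function applyResultBudget(raw: string, maxChars: number): BudgetedResult {
  if (raw.length <= maxChars) {
    return { text: raw, truncated: false, omittedChars: 0 };
  }
  const omittedChars = raw.length - maxChars;
  return {
    text: `${raw.slice(0, maxChars)}\n[truncated ${omittedChars} chars]`,
    truncated: true,
    omittedChars,
  };
}
```

The explicit marker matters: without it the model cannot tell a truncated result from a complete one, and has no reason to fetch the rest.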

Concurrent execution: tools marked isConcurrencySafe: true can execute in parallel. Configure max parallelism with streaming: { maxConcurrency: 4 }.
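
A bounded-parallelism runner of the kind a maxConcurrency setting implies can be sketched with a small worker pool. runWithLimit is a hypothetical helper written for this page. It illustrates the general technique and says nothing about how Capstan schedules tools internally.

```typescript
// Run async tasks with at most `maxConcurrency` in flight, preserving result order.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  maxConcurrency: number,
): Promise<T[]> {
  const results = new Array<T>(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task index.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(maxConcurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In a setup like the one described above, only calls to tools marked isConcurrencySafe: true would go through such a pool. Other calls would run one at a time.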

Skills

Skills are strategies, not operations. They provide high-level guidance for how to approach a class of problems. When activated, a skill's prompt is injected into the conversation.

Aspect	Tool	Skill
What it is	An operation with I/O	A strategy with guidance text
Invocation	Model calls it with arguments	Model activates it by name
Result	Concrete data (file contents, etc.)	Injected guidance prompt
Side effects	Yes (reads/writes/network)	No (read-only prompt injection)
Source	Developer-defined	Developer-defined or auto-evolved

import { defineSkill } from "@zauso-ai/capstan-ai";

const debuggingSkill = defineSkill({
  name: "debugging",
  description: "Systematic debugging methodology",
  trigger: "When encountering bugs, test failures, or unexpected behavior",
  prompt: `## Debugging Strategy
1. REPRODUCE: Confirm the failure by running the exact failing test.
2. ISOLATE: Narrow down to the smallest reproducing case.
3. HYPOTHESIZE: Form a specific hypothesis about the root cause.
4. VERIFY: Test the hypothesis with targeted reads/searches.
5. FIX: Apply the minimal fix that addresses the root cause.
6. CONFIRM: Re-run the original failing test to verify.`,
  tools: ["read_file", "run_command", "search_code"],
});

At runtime: skills are listed in the system prompt, the runtime injects a synthetic activate_skill tool, and the agent calls it when needed.
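
The runtime flow just described can be sketched in two small functions: one renders the catalog for the system prompt, the other is what a synthetic activate_skill tool could do when called. SkillSketch, buildSkillCatalog, and activateSkill are hypothetical names, and the catalog format is an assumption.

```typescript
interface SkillSketch {
  name: string;
  description: string;
  trigger: string;
  prompt: string;
}

// One line per skill, listed in the system prompt so the model knows what exists.
function buildSkillCatalog(skills: SkillSketch[]): string {
  return skills.map((s) => `- ${s.name}: ${s.description} (${s.trigger})`).join("\n");
}

// What a synthetic activate_skill tool might return: the skill's guidance text,
// or an error message the model can read and recover from.
function activateSkill(skills: SkillSketch[], name: string): { ok: boolean; message: string } {
  const skill = skills.find((s) => s.name === name);
  if (!skill) return { ok: false, message: `Unknown skill: ${name}` };
  return { ok: true, message: skill.prompt };
}
```

Only the short catalog lines are in context on every turn. The full prompt is injected when the model asks for it.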

Memory

The memory system provides scoped, searchable memory that persists across agent runs.

const agent = createSmartAgent({
  // ...
  memory: {
    store: new BuiltinMemoryBackend(),
    scope: { type: "project", id: "my-app" },
    readScopes: [{ type: "global", id: "shared" }],
    maxMemoryTokens: 4000,
    saveSessionSummary: true,
  },
});

Features: initial retrieval before the first iteration, staleness annotations (age-based freshness notes), dynamic enrichment every 5 iterations, session summary auto-save. Backends: BuiltinMemoryBackend (in-memory), SqliteMemoryBackend (persistent), or custom.
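
A staleness annotation can be as simple as appending the memory's age before it is injected into the prompt. The sketch below is illustrative only: annotateStaleness is a hypothetical helper, and the one-day cutoff and the wording of the note are assumptions.

```typescript
const MS_PER_DAY = 86_400_000;

// Append an age note to memories that are at least one day old.
function annotateStaleness(content: string, createdAt: Date, now: Date = new Date()): string {
  const ageDays = Math.floor((now.getTime() - createdAt.getTime()) / MS_PER_DAY);
  if (ageDays < 1) return content;
  const unit = ageDays === 1 ? "day" : "days";
  return `${content} (recorded ${ageDays} ${unit} ago; verify before relying on it)`;
}
```

The note nudges the model to re-check old facts, such as file locations or test status, instead of treating them as current.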

Self-Evolution

Self-evolution enables agents to learn from their runs and improve over time:

Experience (run trajectory) --> Strategy (distilled pattern) --> Skill (promoted guidance)

Run 1-5: Raw experience capture. Each run records goal, outcome, tool call trajectory, iterations, duration.
Run 3+: Strategy distillation. The LLM-driven distiller analyzes trajectories and extracts generalizable strategies.
Run 10+: Strategy refinement. The consolidator merges overlapping strategies and resolves contradictions. Each application adjusts a strategy's utility score: +0.1 on success, -0.05 on failure.
Run 50+: Skill promotion. Strategies reaching utility >= 0.7 after >= 5 applications are auto-promoted to reusable skills.

evolution: {
  store: new SqliteEvolutionStore("./agent-evolution.db"),
  capture: "every-run",          // "every-run" | "on-failure" | "on-success" | custom
  distillation: "post-run",      // "post-run" | "manual"
  pruning: { maxStrategies: 50, minUtility: 0.2 },
  skillPromotion: { minUtility: 0.7, minApplications: 5 },
}
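
The scoring and promotion rules can be worked through in a few lines. This is a sketch under stated assumptions: StrategySketch, recordOutcome, and shouldPromote are hypothetical names, and clamping scores to the range 0 to 1 and rounding to two decimals are choices made here to keep the arithmetic clean. The +0.1 and -0.05 deltas and the promotion thresholds come from the text and config above.

```typescript
interface StrategySketch {
  utility: number;
  applications: number;
}

// Apply one outcome: +0.1 on success, -0.05 on failure.
function recordOutcome(strategy: StrategySketch, success: boolean): StrategySketch {
  const delta = success ? 0.1 : -0.05;
  // Round to two decimals so repeated float additions do not drift
  // (for example 0.7 + 0.1 would otherwise give 0.7999999999999999).
  const rounded = Math.round((strategy.utility + delta) * 100) / 100;
  const utility = Math.min(1, Math.max(0, rounded));
  return { utility, applications: strategy.applications + 1 };
}

// Promotion needs both enough evidence and a high enough score.
function shouldPromote(strategy: StrategySketch, minUtility = 0.7, minApplications = 5): boolean {
  return strategy.applications >= minApplications && strategy.utility >= minUtility;
}
```

Under these assumptions, a strategy starting at 0.5 reaches 0.7 after two successes but is not promoted until it has been applied five times, which keeps a short lucky streak from becoming a permanent skill.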

Production Robustness

The agent loop includes nine robustness mechanisms:

Mechanism	Description
Model fallback	When primary LLM fails, strip thinking blocks and retry with fallbackLlm
Reactive compression	3-phase: autocompact -> reactive compact -> fatal
Token budget	Nudge at 80% + force-complete at 100%
LLM watchdog	Chat timeout (120s), stream idle timeout (90s), stall warning (30s)
Tool timeout	Per-tool configurable timeout via Promise.race
Error withholding	Retry failed tools once before exposing error to LLM
Message normalization	Merge adjacent same-role messages, filter empties
Input validation	Two-layer: JSON Schema + custom validate hook
Abort handling	Blocked tool calls produce synthetic results so the LLM can adjust
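
Two of these mechanisms, tool timeout and error withholding, compose naturally. The sketch below shows the general Promise.race timeout pattern plus a single retry. withTimeout and executeWithRetry are hypothetical helpers, not Capstan exports.

```typescript
// Reject if `work` has not settled within `timeoutMs`.
function withTimeout<T>(work: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${timeoutMs}ms`)), timeoutMs);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Error withholding: retry once, and only surface the second failure.
async function executeWithRetry<T>(
  run: () => Promise<T>,
  timeoutMs: number,
  label: string,
): Promise<T> {
  try {
    return await withTimeout(run(), timeoutMs, label);
  } catch {
    return await withTimeout(run(), timeoutMs, label);
  }
}
```

Promise.race does not cancel the losing promise, so a timed-out tool may keep running in the background. Tools with side effects need their own cancellation or idempotency story.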

Lifecycle Hooks

Hooks provide fine-grained control over agent execution. All hooks are optional and non-fatal.

Hook	When	Purpose
`beforeToolCall`	Before each tool execution	Gate: return { allowed: false } to block
`afterToolCall`	After each tool execution	Observe results. Status is "success" or "error"
`onCheckpoint`	At initialization, tool_result, completion	Save or modify checkpoints
`getControlState`	Before LLM, before/after tools	Operator control: pause or cancel
`onRunComplete`	Once at end of run	Final notification/logging
`afterIteration`	After each iteration	Progress monitoring
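
As an illustration of the gating hook, the sketch below blocks shell commands that look destructive and turns the block into a synthetic tool result. The hook's real signature is not shown on this page, so the ToolCallSketch argument, the optional reason field, and toSyntheticResult are all assumptions. Only the { allowed: false } return shape comes from the table above.

```typescript
type GateDecision = { allowed: true } | { allowed: false; reason?: string };

interface ToolCallSketch {
  name: string;
  args: Record<string, unknown>;
}

// Hypothetical gate: refuse run_command calls containing "rm -rf".
function beforeToolCall(call: ToolCallSketch): GateDecision {
  const command = typeof call.args.command === "string" ? call.args.command : "";
  if (call.name === "run_command" && /\brm\s+-rf\b/.test(command)) {
    return { allowed: false, reason: "Destructive command blocked by policy" };
  }
  return { allowed: true };
}

// A blocked call becomes a synthetic result, so the model sees why the call
// did not run and can adjust its plan.
function toSyntheticResult(call: ToolCallSketch, decision: GateDecision): string | null {
  if (decision.allowed) return null;
  return `Tool "${call.name}" was blocked: ${decision.reason ?? "no reason given"}`;
}
```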

Checkpoints and Resume

Every agent run produces checkpoints that can be used to resume interrupted runs:

const result = await agent.run("Deploy the new feature");

if (result.status === "paused") {
  const resumed = await agent.resume(result.checkpoint!, "Approved. Continue deployment.");
}

if (result.status === "approval_required") {
  const { tool, args, reason } = result.pendingApproval!;
  // After human approval:
  const resumed = await agent.resume(result.checkpoint!, "Approved. Proceed.");
}

Stop Hooks (Guardrails)

Stop hooks are quality gates evaluated when the model produces a final response. If any hook fails, the response is rejected with feedback and the agent continues. After 3 consecutive rejections, the agent is force-completed.
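
The rejection logic can be sketched as a single decision function. The names StopVerdict, StopHookSketch, and decideContinuation are hypothetical, and hooks are synchronous here for brevity. Real hooks would likely be async, and the StopHook type may differ.

```typescript
type StopVerdict = { pass: true } | { pass: false; feedback: string };
type StopHookSketch = (response: string) => StopVerdict;

type ContinuationDecision =
  | { action: "complete" }
  | { action: "force_complete" }
  | { action: "continue"; feedback: string; rejections: number };

const MAX_REJECTIONS = 3;

// Evaluate every hook; any failure rejects the response unless the cap is hit.
function decideContinuation(
  response: string,
  hooks: StopHookSketch[],
  priorRejections: number,
): ContinuationDecision {
  const failures: string[] = [];
  for (const hook of hooks) {
    const verdict = hook(response);
    if (!verdict.pass) failures.push(verdict.feedback);
  }
  if (failures.length === 0) return { action: "complete" };

  const rejections = priorRejections + 1;
  if (rejections >= MAX_REJECTIONS) return { action: "force_complete" };
  return { action: "continue", feedback: failures.join("\n"), rejections };
}
```

The cap is what keeps an over-strict hook from trapping the agent in an endless revise loop.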

Part 2: Full-Stack Web

defineAPI()

defineAPI() is the central building block for web endpoints. A single call defines a typed handler projected to HTTP, MCP, A2A, and OpenAPI.

import { defineAPI } from "@zauso-ai/capstan-core";
import { z } from "zod";

export const POST = defineAPI({
  input: z.object({
    title: z.string().min(1).max(200),
    priority: z.enum(["low", "medium", "high"]).default("medium"),
  }),
  output: z.object({
    id: z.string(),
    title: z.string(),
    status: z.string(),
  }),
  description: "Create a new ticket",
  capability: "write",
  resource: "ticket",
  policy: "requireAuth",
  async handler({ input, ctx }) {
    return { id: crypto.randomUUID(), title: input.title, status: "open" };
  },
});

Multi-Protocol Projection

One defineAPI() call simultaneously creates endpoints across four protocols:

defineAPI() --> CapabilityRegistry
                  |-- HTTP JSON API (Hono)
                  |-- MCP Tools (@modelcontextprotocol/sdk)
                  |-- A2A Skills (Google Agent-to-Agent)
                  +-- OpenAPI 3.1 Spec

HTTP -- Input from query params (GET) or JSON body (POST/PUT/PATCH/DELETE)
MCP -- Each route becomes a tool. GET /tickets becomes get_tickets
A2A -- Each route becomes a skill via JSON-RPC tasks/send
OpenAPI -- Each route becomes an operation with full schema generation
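
The MCP naming rule can be illustrated with a short function. Only the GET /tickets to get_tickets mapping is stated above. The handling of nested and dynamic segments in this sketch is an assumption, and toMcpToolName is a hypothetical helper.

```typescript
// Derive a tool name from an HTTP method and route path.
function toMcpToolName(method: string, path: string): string {
  const segments = path
    .split("/")
    .filter(Boolean)
    // Assumption: non-alphanumeric characters (":" in ":id", "-" in slugs)
    // collapse to underscores, with leading/trailing underscores trimmed.
    .map((segment) => segment.replace(/[^A-Za-z0-9]+/g, "_").replace(/^_+|_+$/g, ""));
  return [method.toLowerCase(), ...segments].filter(Boolean).join("_");
}
```

Under these assumptions, POST /tickets would become post_tickets and GET /tickets/:id would become get_tickets_id.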

Auto-Generated Endpoints

Endpoint	Protocol	Description
`GET /.well-known/capstan.json`	Capstan	Agent manifest with all capabilities
`GET /.well-known/agent.json`	A2A	Agent card with skills list
`POST /.well-known/a2a`	A2A	JSON-RPC task handler
`POST /.well-known/mcp`	MCP	Streamable HTTP MCP endpoint
`GET /openapi.json`	OpenAPI	OpenAPI 3.1 specification
`GET /capstan/approvals`	Capstan	Approval workflow management

File-Based Routing

Routes live in app/routes/. The router scans the directory tree and maps files to URL patterns.

File Pattern	Route Type	Description
`*.api.ts`	API	API handler (exports HTTP methods)
`*.page.tsx`	Page	React page component (SSR)
`_layout.tsx`	Layout	Wraps nested routes via <Outlet>
`_middleware.ts`	Middleware	Runs before handlers in scope
`_loading.tsx`	Loading	Suspense fallback for pages
`_error.tsx`	Error	Error boundary for pages

Dynamic segments use [param], catch-all uses [...param], route groups use (name) (transparent in URL).
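
The mapping from file path to URL pattern can be sketched as follows. fileToRoutePattern is a hypothetical helper; the output syntax (:param for dynamic segments, *param for catch-all) and the treatment of index files as the directory root are assumptions, not documented router behavior.

```typescript
// Convert a path relative to app/routes/ into a URL pattern.
function fileToRoutePattern(relativePath: string): string {
  const withoutExtension = relativePath.replace(/\.(api\.ts|page\.tsx)$/, "");
  const segments = withoutExtension
    .split("/")
    // Route groups like (admin) organize files but do not appear in the URL.
    .filter((segment) => segment !== "" && !/^\(.+\)$/.test(segment))
    // Assumption: index files map to their parent directory.
    .filter((segment) => segment !== "index")
    .map((segment) => {
      // Check catch-all first, because [...slug] also matches the [param] shape.
      const catchAll = segment.match(/^\[\.\.\.(.+)\]$/);
      if (catchAll) return `*${catchAll[1]}`;
      const dynamic = segment.match(/^\[(.+)\]$/);
      if (dynamic) return `:${dynamic[1]}`;
      return segment;
    });
  return "/" + segments.join("/");
}
```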

definePolicy()

Policies define permission rules evaluated before route handlers.

import { definePolicy } from "@zauso-ai/capstan-core";

export const requireAuth = definePolicy({
  key: "requireAuth",
  title: "Require Authentication",
  effect: "deny",
  async check({ ctx }) {
    if (!ctx.auth.isAuthenticated) {
      return { effect: "deny", reason: "Authentication required" };
    }
    return { effect: "allow" };
  },
});

Policy Effects

Effect	Behavior
`allow`	Request proceeds normally
`deny`	Request is rejected with 403 Forbidden
`approve`	Request is held for human approval (returns 202 with approval ID)
`redact`	Request proceeds but response data may be filtered

When multiple policies apply, all are evaluated and the most restrictive effect wins: allow < redact < approve < deny.
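
The "most restrictive wins" rule is a maximum over an ordered set. In this sketch, combineEffects is a hypothetical helper, and treating an empty policy list as allow is an assumption.

```typescript
type PolicyEffect = "allow" | "redact" | "approve" | "deny";

// Ordering from the text: allow < redact < approve < deny.
const SEVERITY: Record<PolicyEffect, number> = { allow: 0, redact: 1, approve: 2, deny: 3 };

// Return the most restrictive effect among all evaluated policies.
function combineEffects(effects: PolicyEffect[]): PolicyEffect {
  return effects.reduce<PolicyEffect>(
    (worst, effect) => (SEVERITY[effect] > SEVERITY[worst] ? effect : worst),
    "allow",
  );
}
```

For example, if one policy returns redact and another returns approve, the request is held for approval. A single deny overrides everything else.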

defineModel (Database)

Capstan uses Drizzle ORM for data modeling. defineModel() creates typed table definitions with auto-generated CRUD route helpers.

import { defineModel } from "@zauso-ai/capstan-db";
import { text } from "drizzle-orm/sqlite-core";

export const ticket = defineModel("ticket", {
  title: text("title").notNull(),
  priority: text("priority").default("medium"),
  status: text("status").default("open"),
});

Features: migrations, vector search, and generated CRUD endpoints that integrate with defineAPI() and the multi-protocol registry.

Verification Loop

capstan verify --json runs an 8-step cascade against your application:

Step	Checks
`structure`	Required files exist
`config`	Config file loads and has a valid export
`routes`	API files export handlers, write endpoints have policies
`models`	Model definitions valid
`typecheck`	tsc --noEmit passes
`contracts`	Models match routes, policy references valid
`manifest`	Agent manifest matches live routes
`protocols`	HTTP, MCP, A2A, OpenAPI schema consistency

Output includes repairChecklist with fixCategory and autoFixable flags, enabling an AI self-repair loop.

AI in Web Handlers

The AI toolkit integrates with web handlers via the request context:

export const POST = defineAPI({
  // ...
  async handler({ input, ctx }) {
    const analysis = await ctx.think(input.message, {
      schema: z.object({ intent: z.string(), confidence: z.number() }),
    });

    await ctx.remember(`User asked about: ${analysis.intent}`);
    const history = await ctx.recall(input.message);

    return { analysis, relatedHistory: history };
  },
});

think() returns structured data via Zod schema parsing. generate() returns raw text. Both have streaming variants.