March 31, 2026 · 10 min read
I've been using Claude Code for a while now, and like most developers, I treated it as a black box. Type something, get code. Tab-complete, accept, move on. But someone recently did a full reverse-engineering of the Claude Code source and published their findings — and honestly, it changed how I think about "simple" developer tools.
This is my read of those findings, with the stuff I found most surprising.
It's a React App. In Your Terminal.
Let's start with the thing that stopped me mid-scroll: Claude Code doesn't just use your terminal. It has a custom React reconciler built specifically for terminal rendering.
The usual story for terminal UIs is: draw some text, move the cursor around, print escape codes. It works, but it's fragile. Claude Code took a completely different approach — they wrote a custom React host that targets the terminal the same way React DOM targets the browser. Components, state, re-renders, the whole thing.
On top of that, they're using Yoga — Facebook's flexbox layout engine — to do the actual positioning. So when you see a box in Claude Code's output, that box was actually calculated by the same engine that powers React Native layouts. Flexbox. In your terminal.
And it doesn't stop there. There's double buffering (render to a back buffer, swap with front buffer on frame completion), a dirty-flag cascade so only changed subtrees get re-laid out, and hardware scroll regions for fast scrolling. These are techniques you'd expect in a game engine, not a CLI tool.
The rendering pipeline looks roughly like this:
React Components
→ Custom React Reconciler
→ Virtual DOM (ink-box, ink-text, ink-root)
→ Yoga flexbox layout
→ 2D screen buffer (with style interning)
→ Diff with previous frame
→ ANSI escape sequences → your terminal
I'd genuinely call this the most sophisticated terminal renderer I've seen outside of a dedicated TUI library.
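To make the "diff with previous frame" step concrete, here's a minimal sketch of the idea — my own simplification, not the actual decompiled code: compare the back buffer to the front buffer row by row, and emit cursor moves and text only for rows that changed.

```typescript
// Minimal frame diff: compare the previous frame to the next one and
// emit ANSI output only for rows that actually changed.
// Illustrative simplification, not Claude Code's real renderer.
type Frame = string[]; // one string per terminal row

function diffFrames(prev: Frame, next: Frame): string {
  let out = "";
  for (let row = 0; row < next.length; row++) {
    if (prev[row] !== next[row]) {
      // CSI row;1H moves the cursor (1-indexed), CSI 2K clears the line
      out += `\x1b[${row + 1};1H\x1b[2K${next[row]}`;
    }
  }
  return out;
}
```

If only one row changed between frames, only that row gets redrawn — identical frames emit nothing at all, which is what makes a 60-column React tree cheap to re-render.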
Startup is Secretly Parallel
Most CLI tools start serially — load config, authenticate, initialize, then show you something. Claude Code starts differently.
Before any TypeScript modules even finish importing, it fires off three operations in parallel: a macOS MDM policy read and two separate keychain reads, one for API keys and one for OAuth tokens. These are expensive OS-level calls that would normally block startup.
The insight is subtle: TypeScript module evaluation takes about 135ms anyway because it's inherently sequential. You can't make it faster. But you can overlap it with slow I/O by spawning subprocesses immediately. By the time imports finish, your credentials are already loaded. The keychain reads become essentially free.
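The trick is the standard "start the promise early, await it late" pattern. Here's a self-contained sketch — the "keychain reads" are simulated with timers, where the real code spawns OS subprocesses:

```typescript
// Sketch of the overlap trick: start the slow I/O *before* the
// sequential work, then await it afterward. The "keychain reads" are
// simulated with timers here; the real thing spawns OS subprocesses.
function slowRead(name: string, ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(`${name}-value`), ms));
}

async function startup(): Promise<string[]> {
  // Fire off the expensive reads first; they run in the background.
  const pending = Promise.all([
    slowRead("mdm-policy", 120),
    slowRead("api-key", 120),
    slowRead("oauth-token", 120),
  ]);

  // Simulate ~150ms of inherently sequential module evaluation.
  await new Promise((r) => setTimeout(r, 150));

  // The reads finished while we were "importing": this await is ~free.
  return pending;
}
```

Run serially, this would take ~270ms; overlapped, it takes ~150ms — the cost of the reads disappears into the module-evaluation time.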
The rest of startup is 16 stages — API preconnection, TLS setup, proxy configuration, IDE detection — but a lot of it is parallelized or deferred. The goal is to get something on screen as fast as possible.
There's also a sampled profiler built in: 100% of internal builds, 0.5% of external users. If you're not in the sampled group, you pay zero overhead. CLAUDE_CODE_PROFILE_STARTUP=1 forces full profiling if you want to dig in yourself.
The Query Loop is a Resilient State Machine
The core of Claude Code — the thing that sends your message to the model and handles the response — is a while(true) loop in a file called query.ts. That's not a criticism; it's the right shape for the problem.
Each iteration of the loop:
- Checks if the context window is getting full and runs compaction if needed
- Calls the API with streaming
- Handles any streaming errors
- Executes tool calls (in parallel where safe, serially otherwise)
- Decides whether to continue or return
The interesting part is how it handles failures. When the model runs out of output tokens mid-response, the loop doesn't give up — it injects a hidden meta-message ("Resume directly, no recap") and keeps going. It'll do this up to 3 times before surfacing the issue to you.
When the context window fills up (HTTP 413), it tries compaction: summarize the old messages, replace them with a boundary marker, continue. The error is withheld from you until recovery fails. Most of the time, you never know it happened.
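The loop's shape is easy to sketch. This skeleton is my guess at the structure, not the actual query.ts source — the type names and helpers are invented for illustration:

```typescript
// Skeleton of the resilient loop described above. Names and structure
// are illustrative, not the decompiled query.ts.
type Turn = { type: "done" | "tool_use" | "max_tokens" };

async function queryLoop(
  callApi: () => Promise<Turn>,
  runTools: () => Promise<void>,
  maxResumes = 3,
): Promise<string> {
  let resumes = 0;
  while (true) {
    const turn = await callApi();
    if (turn.type === "max_tokens") {
      // Output budget exhausted mid-response: inject a hidden
      // "resume directly, no recap" message and call again.
      if (++resumes > maxResumes) return "gave-up";
      continue;
    }
    if (turn.type === "tool_use") {
      await runTools(); // results are appended, then the loop continues
      continue;
    }
    return "done"; // the model produced a final answer
  }
}
```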
Tool execution uses a smart concurrency model:
partitionToolCalls(blocks):
Batch 1: Read-only tools A, B, C → run concurrently (max 10)
Batch 2: Write tool D → run serially
Batch 3: Read-only tools E, F → run concurrently
Each tool declares whether it's concurrency-safe. File reads, glob, grep — run in parallel. File edits, bash with side effects — run serially with context passed between batches. It even starts executing some tools while the model is still streaming the response, overlapping computation and I/O.
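The batching pseudocode above can be made runnable in a few lines. This is my own reconstruction of the logic, with the "max 10 concurrent" cap from the example baked in:

```typescript
// Runnable take on the batching pseudocode: consecutive read-only
// (concurrency-safe) calls share a batch of up to 10; each unsafe call
// gets a batch of its own. Names are illustrative.
interface ToolCall { name: string; readOnly: boolean }

function partitionToolCalls(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (call.readOnly && last && last[0].readOnly && last.length < 10) {
      last.push(call); // extend the current concurrent batch
    } else {
      batches.push([call]); // unsafe call (or first call) starts a new batch
    }
  }
  return batches;
}
```

Each batch then runs with `Promise.all` if it's read-only, or one call at a time if not.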
60+ Tools, One Generic Interface
Every tool in Claude Code — bash execution, file editing, web search, spawning subagents — conforms to a single generic TypeScript interface. Input schema, output type, progress events, four rendering methods (use/progress/result/error), permission check, concurrency flag.
This uniformity is what lets the system treat MCP tools and built-in tools identically once they're registered. It's also what makes the permission system clean.
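As a rough picture of what that interface might look like — field names here are my own guess, not the decompiled definitions:

```typescript
// A guess at the shape of the generic tool interface described above;
// the field names are mine, not the actual decompiled source.
interface Tool<In, Out> {
  name: string;
  isConcurrencySafe: boolean;
  checkPermission(input: In): "allow" | "ask" | "deny";
  call(input: In): Promise<Out>;
  renderResult(output: Out): string;
}

// Example: a read-only tool is trivially concurrency-safe.
const wordCount: Tool<{ text: string }, number> = {
  name: "WordCount",
  isConcurrencySafe: true,
  checkPermission: () => "allow",
  call: async ({ text }) => text.split(/\s+/).filter(Boolean).length,
  renderResult: (n) => `${n} words`,
};
```

The point of the uniformity is that the scheduler, the permission layer, and the renderer never need to know which tool they're holding.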
One thing I hadn't thought about: tool schemas are sorted alphabetically before being sent to the API. This sounds trivial but it's actually a cache optimization — keeping the tool list in the same order across requests maximizes prompt cache hits on Anthropic's side, which reduces latency and cost.
Not all tools are sent in every request either. About 18 tools are "deferred" — hidden from the base prompt until the model explicitly searches for them using a ToolSearchTool. The model asks "do you have a task creation tool?", gets the schema back, and calls it in the same turn. This keeps the base context under 200K tokens while still supporting 60+ tools.
The Permission System Has Five Modes and ML Classifiers
When Claude Code asks "do you want to run this command?", there's a whole system behind that question.
The five modes:
- default — ask for destructive operations
- plan — read-only, show proposed actions
- acceptEdits — auto-approve file edits, ask for shell commands
- bypassPermissions — full trust (explicit opt-in, labeled dangerous)
- dontAsk — auto-deny anything unsafe
Rules support glob patterns: Bash(git push*) allows any git push, Bash(python:*) allows all Python execution. And there's a "dangerous pattern" detector — rules that are too broad (like Bash(*)) get flagged rather than silently auto-allowed.
Every permission check returns one of three things: allow (optionally with a modified input), ask (show dialog), or deny (return error to model). The allow case can even modify the tool's input transparently — a pre-execution hook could add safety flags to a shell command before it runs.
The default for any new tool is "ask". You have to explicitly opt into auto-approval.
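A minimal matcher for rules like these is straightforward. The rule syntax below is from the article; the matching semantics are my simplified interpretation, not the real implementation:

```typescript
// Minimal matcher for rules like Bash(git push*) and Bash(python:*).
// My simplified interpretation of the semantics, not the real code.
function matchesRule(rule: string, tool: string, command: string): boolean {
  const m = /^([A-Za-z]+)\((.*)\)$/.exec(rule);
  if (!m || m[1] !== tool) return false;
  const body = m[2];
  if (body.endsWith(":*")) {
    // python:* -> "python" itself, or "python <any arguments>"
    const cmd = body.slice(0, -2);
    return command === cmd || command.startsWith(cmd + " ");
  }
  if (body.endsWith("*")) return command.startsWith(body.slice(0, -1));
  return command === body;
}

// Overly broad rules get flagged rather than silently auto-allowed.
function isDangerousRule(rule: string): boolean {
  return /\(\s*\*\s*\)$/.test(rule); // e.g. Bash(*)
}
```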
Context Management is a Multi-Stage Fight
Modern Claude models have large context windows, but long coding sessions can still push limits. Claude Code handles this with several compaction strategies, not just one.
Auto-compaction: When tokens approach the limit (context window minus 13K), it strips images from old messages, groups messages by API round, calls the model to summarize the old conversation, and replaces it with a boundary marker. It re-injects up to 5 relevant files and any loaded skills afterward.
Microcompaction: Lighter weight. Clear old tool results based on age or size, targeting only tools that produce large outputs (file reads, grep results, bash output).
Context collapse: Staged — collapses are prepared lazily and only committed if the API actually returns a 413. This avoids unnecessary summarization.
Token budget continuation: When output runs out mid-task, inject a resume prompt and keep going, up to 3 times.
The system also tracks prompt cache state carefully, since disturbing the cached portion of the prompt has a real cost.
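The trigger logic and the microcompaction pass can be sketched together. The 13K margin is from the article; the function shapes and the one-token stub are my own simplification:

```typescript
// Auto-compaction trigger: compact once usage crosses (window - 13K).
// The margin is from the article; the rest is my simplification.
const COMPACTION_MARGIN = 13_000;

function shouldAutoCompact(usedTokens: number, contextWindow: number): boolean {
  return usedTokens >= contextWindow - COMPACTION_MARGIN;
}

interface Msg { text: string; tokens: number; largeToolResult: boolean }

// Microcompaction sketch: walk oldest-first, replacing large tool
// results with a tiny stub until the total fits again.
function microcompact(history: Msg[], contextWindow: number): Msg[] {
  let total = history.reduce((sum, m) => sum + m.tokens, 0);
  return history.map((m) => {
    if (shouldAutoCompact(total, contextWindow) && m.largeToolResult) {
      total -= m.tokens - 1;
      return { text: "[tool result cleared]", tokens: 1, largeToolResult: false };
    }
    return m;
  });
}
```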
100+ Slash Commands, Three Types
Claude Code has over 100 slash commands, organized into three categories:
- PromptCommand: Expands to text sent to Claude (/commit, /review, /explain)
- LocalCommand: Runs locally and outputs text (/clear, /status, /cost)
- LocalJSXCommand: Renders a React component in the terminal (/config, /mcp, /doctor)
Commands are lazy-loaded and memoized by working directory. The registry merges built-in commands with skills, plugins, MCP-provided commands, and workflow commands.
Remote and bridge modes have filtered command lists. Over WebSocket to claude.ai, LocalJSX commands are blocked entirely — you can't render a React terminal component over a bridge.
The /commit command has explicit safety rules baked in: never --amend, never --no-verify, never interactive mode. It only allows git add, git status, and git commit. The commit tool has its own hardcoded allowed-tools list separate from the session permissions.
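A guard like that could be as simple as an allow-list plus a banned-flag list. This sketch is my own; the actual rules beyond never-amend, never-no-verify, never-interactive aren't in the source material:

```typescript
// Sketch of a hardcoded /commit guard: an explicit allow-list of git
// subcommands plus banned flags. My own reconstruction, not the real code.
const ALLOWED_GIT = ["git add", "git status", "git commit"];
const BANNED_FLAGS = ["--amend", "--no-verify", "-i", "--interactive"];

function isAllowedCommitCommand(cmd: string): boolean {
  const allowed = ALLOWED_GIT.some((p) => cmd === p || cmd.startsWith(p + " "));
  if (!allowed) return false;
  return !BANNED_FLAGS.some((flag) => cmd.split(" ").includes(flag));
}
```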
Multi-Agent Coordination and Worktrees
Claude Code can spawn child agents — separate conversation contexts with their own tool access. They can run in three isolation modes:
- Default: Shared filesystem, separate context
- Worktree: Isolated git branch, changes merged on completion or discarded
- Remote: Runs on a separate machine entirely
Agents are addressable by name. The model can say "ask the test-runner agent to run the suite" and a SendMessage call routes to the right agent. Task state is tracked with file-based IPC — outputs go to ~/.claude/, lock files coordinate concurrent access with retry backoff.
Background tasks use base-36 encoded IDs with type prefixes (b=bash, a=agent, r=remote). There's a full status enum: pending, running, completed, failed, killed.
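The ID scheme is easy to picture. The prefixes are from the article; the counter-based body is my own stand-in for however IDs are actually generated:

```typescript
// Prefixed base-36 task IDs (b=bash, a=agent, r=remote). The prefix
// scheme is from the article; the counter is my own stand-in.
type TaskKind = "bash" | "agent" | "remote";
const PREFIX: Record<TaskKind, string> = { bash: "b", agent: "a", remote: "r" };

let counter = 0;
function newTaskId(kind: TaskKind): string {
  return PREFIX[kind] + (counter++).toString(36);
}
```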
The Engineering Patterns Worth Stealing
Looking across the whole system, a few patterns stand out:
Lazy everything: Zod schema instantiation is deferred, modules are imported on demand, tools are discovered progressively. This keeps startup fast and memory bounded.
Intern everything that gets compared repeatedly: Character strings, ANSI style combinations, and hyperlink URLs all go through interning pools. Style transitions are pre-computed as a lookup table — going from style A to style B is an O(1) integer lookup, not an O(n) diff.
Make safety the default: New tools default to "ask". The permission system fails closed. Dangerous patterns are detected and flagged rather than silently allowed.
Centralize side effects: All state mutations that affect external systems go through one function (onChangeAppState). No scattered useEffect side effects across the codebase.
File-based IPC for multi-agent: No sockets, no shared memory. Task outputs go to files. Transcripts are written to disk before API calls, so conversations survive process death.
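The interning pattern in particular is worth seeing in miniature. This is my own illustration of the idea, not the renderer's actual code: each distinct style maps to a small integer, and transitions between two style ids are computed once, then served from a lookup:

```typescript
// Style interning sketch: distinct styles get small integer ids, and
// the escape sequence for going from style A to style B is computed
// once per (A, B) pair, then served from a cache. Illustrative only.
const styleIds = new Map<string, number>();
const transitions = new Map<string, string>();

function internStyle(style: string): number {
  let id = styleIds.get(style);
  if (id === undefined) {
    id = styleIds.size;
    styleIds.set(style, id);
  }
  return id;
}

function transition(fromId: number, toId: number, compute: () => string): string {
  const key = `${fromId}->${toId}`;
  let seq = transitions.get(key);
  if (seq === undefined) {
    seq = compute(); // computed once, then an O(1) lookup forever
    transitions.set(key, seq);
  }
  return seq;
}
```

After warm-up, changing styles while diffing a frame never re-derives an escape sequence — it's one map hit per cell.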
The Takeaway
Claude Code isn't a thin wrapper around an API. It's a complete development environment that happens to live in your terminal — with a custom renderer, a resilient query loop, an elastic tool system, multi-agent coordination, and layered error recovery that quietly handles most failures before you ever see them.
The next time it "just works" through a context overflow or a model fallback, now you know what's running underneath.