
Isomux Design and Architecture

Introduction

I have finally reached Level 6 in Steve Yegge's hierarchy!

Steve Yegge's hierarchy of software engineering needs

I was at Level 5 for about 5 months, using Claude Code as my primary coding tool.

The main friction with getting to Level 6 was terminal management, especially for tasks that can only be done remotely, like model training.

Tmux helped; cmux was even better. But trying to keep good uptime on multiple agents felt... cramped.

What actually did it for me was:

  1. Building my own browser-based agent orchestration.
  2. Running said tool on a home server inside a Tailscale network with my laptop and phone.

This simplifies the two ends of my workflow:

  • What device I'm on: all devices see the same agents and conversations.
  • Claude's environment: all agents run on the same machine.

Two great things not to have to worry about!

But my custom-made orchestration, Isomux (Isometric Multiplexer), has something extra: it's cute.

Isomux office view with agents at desks

I spend all day in the agent management tool. It had to be cute.

The idea is to create an office metaphor for agents, with isometric graphics for the nostalgia hit. Each agent has a customizable name and look and sits at a desk. You see who is working, who's sleeping, and who has their hand raised at a glance.

The thesis is that by anthropomorphizing agents, we reduce cognitive load; we're more used to coordinating humans than terminals. It's working for me.

Please play with the demo before reading on. The source is on GitHub.

Isomux has been built by Claude Code agents running inside Isomux since about 3 hours after the project started. Developing a dev tool from within itself is fun!

Architecture Overview

Isomux is a single Bun process that:

  • serves the browser frontend,
  • talks to browsers over WebSocket,
  • manages agent lifecycles,
  • and runs Claude Code sessions with the Agent SDK.

Browser A  ──┐                      ┌── Agent 1 (SDK session)
Browser B  ──┼── WebSocket ── Bun ──┼── Agent 2 (SDK session)
Phone      ──┘              server  └── Agent 3 (SDK session)

How the Claude Agent SDK Works

The Claude Agent SDK lets you run Claude Code sessions programmatically from JavaScript. You create a session, send messages, and get responses. It works with your existing Claude subscription; you just need to be logged in with the claude CLI tool (/login).

But unlike a simple request/response API, the SDK gives you a stream of events.

When you send a message, you don't get a single response back. You get events over time: "assistant started thinking," "assistant wants to use a tool," "tool produced output," "assistant is done."

A single user message can trigger a stream that lasts minutes. The SDK exposes this as an async iterator you read in a loop.
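A minimal sketch of what reading such a stream looks like. The event shapes and the fakeStream generator are hypothetical stand-ins, not the SDK's real types; the only point is the for await pattern.

```typescript
// Hypothetical event shapes; the real SDK's message types differ.
type StreamEvent =
  | { type: "thinking" }
  | { type: "tool_use"; tool: string }
  | { type: "tool_result"; output: string }
  | { type: "done" };

// Stand-in for an SDK stream: an async generator that yields events in order.
async function* fakeStream(): AsyncGenerator<StreamEvent> {
  yield { type: "thinking" };
  yield { type: "tool_use", tool: "Bash" };
  yield { type: "tool_result", output: "ok" };
  yield { type: "done" };
}

// Read events one by one until the stream reports completion.
async function collectEvents(stream: AsyncIterable<StreamEvent>): Promise<string[]> {
  const seen: string[] = [];
  for await (const event of stream) {
    seen.push(event.type);
    if (event.type === "done") break;
  }
  return seen;
}
```

In a real loop, each event would be handled as it arrives rather than collected, since the stream can stay open for minutes.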

Sessions have an ID. If your application crashes or restarts, you can resume a session by its ID and the conversation history carries over.

V1 vs V2

As of April 2026, the SDK has two versions. V1 (query()) is a fire-and-forget async call: you send a message and it runs to completion. There's no handle to grab, so there's no way to interrupt it.

V2 (unstable_v2_createSession) gives you a persistent session object with send(), stream(), and close().

This makes abort possible: call close() to kill the stream, then resumeSession(sessionId) to resume that same session, perhaps with a new user message at the end. In contrast, V1's query() always runs to completion.

Isomux needs the ability to abort agents (e.g., the user does Ctrl+C to add, "Sorry, I meant..."), so we chose V2 even though it's in alpha.

For now, V2 seems a bit buggy. Sometimes, the message order gets fumbled. Here is an example of the kind of bug I ran into.

The Agent Lifecycle

Spawning agents

When you click an empty desk to spawn an agent, you can provide:

  • a name,
  • a working directory (cwd), which matters for things like CLAUDE.md, git context, and MCP servers defined in that directory,
  • a model,
  • an agent-specific system prompt.

The browser sends a spawn command to the server, which:

  1. Creates a launcher script,
  2. Initializes the SDK session,
  3. Emits an agent_added event to all browsers.

There's a problem: Claude SDK's SDKSessionOptions doesn't expose fields for cwd or appendSystemPrompt - those are CLI flags. Isomux works around this by generating a per-agent .mjs launcher script.

A .mjs file is a JavaScript file using ES module syntax, which is needed here for the top-level await:

// ~/.isomux/launchers/AGENT_ID.mjs

// sets the working directory
process.chdir(...)

// injects the agent identity, office context, office prompt, and custom
// instructions
process.argv.push("--append-system-prompt", ...)

// boots the Claude Code CLI, which picks up the modified cwd and argv
await import("isomux/node_modules/@anthropic-ai/claude-agent-sdk/cli.js")

The launcher changes process.cwd() and injects --append-system-prompt into process.argv before importing the Claude Code CLI entry point. This is passed to createSession as pathToClaudeCodeExecutable:

// server/agent-manager.ts
function createSession(managed: ManagedAgent, resumeSessionId?: string) {
  const opts = {
    model: managed.info.model,
    permissionMode: managed.info.permissionMode,
    pathToClaudeCodeExecutable: managed.launcherPath,
    hooks: createSafetyHooks(),
  };
  return resumeSessionId
    ? unstable_v2_resumeSession(resumeSessionId, opts)
    : unstable_v2_createSession(opts);
}

This is admittedly a hack - it relies on the SDK spawning the launcher as a subprocess - but works reliably.
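For illustration, here is one way the launcher source could be generated. The renderLauncher helper and its parameters are hypothetical, not Isomux's actual code. The sketch relies on JSON.stringify producing valid JavaScript string literals, so quotes and newlines in a user-written prompt cannot break the generated script.

```typescript
// Hypothetical helper: renders the source of a per-agent .mjs launcher.
function renderLauncher(cwd: string, systemPrompt: string, cliEntry: string): string {
  return [
    // Set the working directory the CLI will see.
    `process.chdir(${JSON.stringify(cwd)});`,
    // Inject the system prompt as if it had been passed on the command line.
    `process.argv.push("--append-system-prompt", ${JSON.stringify(systemPrompt)});`,
    // Boot the CLI entry point; top-level await is why the file must be .mjs.
    `await import(${JSON.stringify(cliEntry)});`,
    "",
  ].join("\n");
}
```

The server would write the returned string to the agent's launcher path before creating the session.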

Agent identity

The system prompt looks like this, but with the parameters expanded:

You are AGENT_NAME, one of the agents in the Isomux office. Your goal is to help
the office boss, who talks to you in this chat.

To discover other office agents and their conversation logs, read
~/.isomux/agents-summary.json.

USER_DEFINED_OFFICE_WIDE_SYSTEM_PROMPT

USER_DEFINED_AGENT_SPECIFIC_SYSTEM_PROMPT

The system prompt is designed to be brief, but leaves breadcrumbs so the agent can load in more state if it needs to.
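Assembling that template is simple string work. A sketch, with a hypothetical buildSystemPrompt helper; dropping empty user-defined sections is my assumption about reasonable behavior, not something confirmed by the Isomux source.

```typescript
// Hypothetical helper mirroring the template above.
function buildSystemPrompt(agentName: string, officePrompt: string, agentPrompt: string): string {
  const sections = [
    `You are ${agentName}, one of the agents in the Isomux office. ` +
      `Your goal is to help the office boss, who talks to you in this chat.`,
    `To discover other office agents and their conversation logs, read ~/.isomux/agents-summary.json.`,
    officePrompt.trim(),
    agentPrompt.trim(),
  ];
  // Assumption: skip empty user-defined sections so the prompt has no blank gaps.
  return sections.filter((section) => section.length > 0).join("\n\n");
}
```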

In the mentioned agent summary doc, the agent can find metadata about itself and every other agent:

// ~/.isomux/agents-summary.json

{
  "id": "agent-1774819851476-qmpf",
  "name": "PersonalSiteAgent",
  "desk": 7,
  "room": 1,
  "topic": "Write technical blog post about isomux",
  "cwd": "~/nilmamano.com",
  "model": "claude-opus-4-6",
  "logDir": "~/.isomux/logs/agent-1774819851476-qmpf"
},

Further, through the logDir paths, it has access to the current conversation of every agent (i.e., since the last /clear, which works per-agent).

This means you can ask an agent, "What do you think of OTHER_AGENT's approach?" and it just works.

Inter-agent communication via shared logs

Agent persistence

The file system is the source of truth. If the server crashes, nothing is lost (I constantly ask the agents to restart their own server while building isomux, and pick conversations right back up).

The ~/.isomux/ folder contains:

  • agents.json: full agent config, including things like the outfit choices and the agent-specific system prompt.
  • agents-summary.json: a lightweight version that every agent's system prompt points to, so agents can discover each other.
  • logs/{agentId}/{sessionId}.jsonl: append-only JSONL files for conversation history. Each line is a LogEntry.
  • office-prompt.txt: user-defined office-wide system prompt injected into all agents.
  • todos.json: shared office todo list.
  • launchers/: the per-agent .mjs launcher scripts discussed above.
  • recent-cwds.json: recently used working directories (for autocomplete in the spawn dialog).

These files are kept consistent with the state sent to the clients.
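To make the append-only JSONL format concrete, here is a small sketch. The LogEntry fields are hypothetical; the real type likely carries more. Each entry is one line, so appending never rewrites earlier history, and a line cut short by a crash can be skipped on load.

```typescript
// Hypothetical LogEntry shape, for illustration only.
interface LogEntry {
  ts: number;
  role: "user" | "assistant" | "tool";
  text: string;
}

// One entry per line. JSON.stringify escapes newlines inside strings,
// so an entry never spans more than one line.
function toJsonlLine(entry: LogEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Parse a whole log file, skipping blank lines and any line that fails to
// parse (for example, a final line truncated by a crash mid-write).
function parseJsonl(content: string): LogEntry[] {
  const entries: LogEntry[] = [];
  for (const line of content.split("\n")) {
    if (line.trim() === "") continue;
    try {
      entries.push(JSON.parse(line) as LogEntry);
    } catch {
      // Ignore the bad line rather than failing the whole load.
    }
  }
  return entries;
}
```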

When a browser first connects, it receives a full snapshot of the office containing the settings of every agent, their logs, and office-wide settings.

On server restart, agents are restored from agents.json and their SDK sessions are recreated.

Past conversations can be resumed from the JSONL logs with /resume or by right-clicking an agent. This makes the /resume interaction per-agent, unlike in Claude Code.

Isomux persists every conversation forever by design. My ~/.isomux/ is 22MB. You never know when it could be useful.

The SDK stream event loop

Isomux reads each agent's event stream in an async loop, converting SDK events into two things browsers care about: log entries for the conversation view, and agent state for the character animations and notifications (thinking, tool calling, waiting, etc.).
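The conversion step can be pictured as a pure function from one event to an update. All names below (event kinds, state names, the Update shape) are hypothetical; the real SDK events and Isomux states differ.

```typescript
// Hypothetical agent states driving the character animations.
type AgentState = "idle" | "thinking" | "tool_calling" | "waiting" | "error";

// Hypothetical, simplified SDK events.
type SdkEvent =
  | { kind: "assistant_text"; text: string }
  | { kind: "tool_use"; tool: string }
  | { kind: "tool_result"; output: string }
  | { kind: "result"; ok: boolean };

// What browsers care about: the new state and, optionally, a log line.
interface Update {
  state: AgentState;
  log?: string;
}

function toUpdate(event: SdkEvent): Update {
  switch (event.kind) {
    case "assistant_text":
      return { state: "thinking", log: event.text };
    case "tool_use":
      return { state: "tool_calling", log: `tool: ${event.tool}` };
    case "tool_result":
      return { state: "thinking", log: event.output };
    case "result":
      // The turn is over: wait for the user, or flag a failure.
      return { state: event.ok ? "waiting" : "error" };
  }
}
```

The loop would call this for each event, append any log line to the agent's JSONL file, and broadcast the update to browsers.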

The WebSocket Layer

Browsers are stateless relays - when one connects, the WebSocket open handler sends it a full_state snapshot, and from there incremental events keep it in sync.

The server talks to connected browsers over WebSockets.

  • The server notifies all browsers of state updates via a single broadcast function.
  • The clients give commands to the server, which are handled in handleCommand().

The send_message command - where a user sends a message to the LLM - is deliberately not awaited. Calling it without await kicks off the async work and returns immediately, so the event loop stays free to process other commands (spawns, aborts, messages to other agents, etc.) while the SDK streams the response in the background. Most other command types are handled synchronously.
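A stripped-down sketch of that pattern. The command shapes, the trace array, and streamResponse are hypothetical stand-ins; this is not the real handleCommand. The key detail is that the promise is not awaited but still gets a catch handler, so a failed stream does not surface as an unhandled rejection.

```typescript
// Hypothetical command shapes.
type Command =
  | { type: "send_message"; agentId: string; text: string }
  | { type: "abort"; agentId: string };

// Records what happened, in order, so the interleaving is visible.
const trace: string[] = [];

// Stand-in for the long-running work of streaming an SDK response.
async function streamResponse(agentId: string, text: string): Promise<void> {
  await Promise.resolve(); // yields control, like any real I/O would
  trace.push(`replied to ${agentId}: ${text}`);
}

function handleCommand(cmd: Command): void {
  switch (cmd.type) {
    case "send_message":
      trace.push(`accepted ${cmd.agentId}`);
      // Deliberately not awaited: returns immediately so the next
      // command can be handled while the response streams.
      streamResponse(cmd.agentId, cmd.text).catch((err: unknown) => {
        trace.push(`error ${cmd.agentId}: ${String(err)}`);
      });
      break;
    case "abort":
      trace.push(`aborted ${cmd.agentId}`);
      break;
  }
}
```

Sending a message to one agent and then aborting another records both commands before the first response completes.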

The Frontend

Office rooms

The office puts agents in rooms of at most 8; extra agents have to go in different rooms. It's designed so that Tab and Shift+Tab agent cycling stays within a room, as cycling through more than 8 conversations would be overwhelming.

In the first room, I keep 3-5 agents for my main project (isomux right now), as well as 1 agent for each of my other projects I touch often (like this site). If an agent has no active conversation (it's been /cleared), it's skipped from cycling.
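The cycling rule is a small wrap-around search. A sketch with hypothetical types, assuming desk order within the room defines the cycle order:

```typescript
// Hypothetical shape: one entry per occupied desk, in desk order.
interface DeskAgent {
  id: string;
  hasConversation: boolean; // false once the agent has been /cleared
}

// Returns the id to focus next. direction is +1 for Tab, -1 for Shift+Tab.
// Focus stays on the current agent if no other agent in the room has an
// active conversation.
function cycleAgent(room: DeskAgent[], currentId: string, direction: 1 | -1): string {
  const n = room.length;
  const start = room.findIndex((agent) => agent.id === currentId);
  if (start === -1) return currentId;
  for (let step = 1; step < n; step++) {
    // Double modulo keeps the index non-negative when moving backwards.
    const index = (((start + direction * step) % n) + n) % n;
    const candidate = room[index];
    if (candidate && candidate.hasConversation) return candidate.id;
  }
  return currentId;
}
```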

I also have agents for non-coding things, like my job search.

If I know I'm not going to touch a project for a while, I move the agent(s) to a different room, so they are out of sight.

Skeuomorphic elements

I've been having fun leaning into the office visuals:

  • Click the corkboard to open the todo list.
  • Click the framed sign on the wall to edit the "office rules" (the office-wide system prompt).
  • Opus agents have a book; Haiku agents have crayons.
  • Click the moon through the window to toggle dark mode.
  • Click the neon sign to visit isomux.com.

The agent customization helps with anthropomorphizing; see, for example, the demo based on the characters from The Office (in my actual setup, the agents have names more like Isomuxer1 and Isomuxer2).

SVG graphics

Opus's SVG skills and understanding of isometric geometry are genuinely good.

The entire scene was written by Opus - ~1,600 lines of raw coordinates, bezier curves, and animate tags. I didn't use any libraries, assets, or tools.

For me, the highlight is the neon sign. It one-shotted the skewed font, the light "diffusion", and the atmospheric flickering. Then, I asked it to add ligatures between letters for realism, and, even though it took some iterations, its first intuition for their positioning and shape was already spot on.

That said, Opus's SVG capabilities are a lot spikier than its coding. It sometimes fails and thrashes at trivial tasks, like moving the window a few pixels over. It's as if Opus sometimes failed the Fizz Buzz test.

Redux-Like Store

The React frontend uses a useReducer store where server messages are actions. The same ServerMessage types that flow over the WebSocket are dispatched directly into the reducer.

This eliminates the usual action-creator boilerplate. Adding a new server event type automatically works end-to-end: define the message on the server, add a case to the reducer, done.

The store also manages local-only state: input drafts (preserved when switching between agents), attention tracking, and the focused agent.
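A sketch of the idea, leaving React out so it stands alone. The message and state shapes are hypothetical (only agent_added is named earlier in this post); what matters is that the reducer's action type is the server's message type itself.

```typescript
// Hypothetical subset of the messages the server sends over the WebSocket.
type ServerMessage =
  | { type: "agent_added"; id: string; name: string }
  | { type: "agent_removed"; id: string }
  | { type: "state_changed"; id: string; state: string };

interface OfficeState {
  agents: Record<string, { name: string; state: string }>;
}

// The reducer takes server messages directly as actions: no action creators.
function officeReducer(state: OfficeState, msg: ServerMessage): OfficeState {
  switch (msg.type) {
    case "agent_added":
      return { agents: { ...state.agents, [msg.id]: { name: msg.name, state: "idle" } } };
    case "agent_removed": {
      const agents = { ...state.agents };
      delete agents[msg.id];
      return { agents };
    }
    case "state_changed": {
      const agent = state.agents[msg.id];
      if (!agent) return state; // message for an unknown agent: ignore
      return { agents: { ...state.agents, [msg.id]: { ...agent, state: msg.state } } };
    }
  }
}
```

With React, this function would be handed to useReducer, and the WebSocket message handler would parse each incoming message and pass it to dispatch.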

Mobile app

I am optimizing the office layout for phone screens.

For example, in the browser, you can use Tab and Shift+Tab to rotate conversations between agents in the room. On mobile, Tab and Shift+Tab are replaced by left and right swipe gestures.

It also includes an optional agent list view, in case the isometric scene is too small.

There's no native app yet, but I use an iPhone/Safari feature that gets 80% of the way there:

Go to the frontend in your browser, then in the browser menu, find the option "Add to Home Screen." This turns the website into a "Web App". Here is a demo.

QoL Features

So far, we described a working architecture, but that's only half of the work; the other half is making it a place you actually want to spend 8 hours a day.

Things like autocomplete on slash commands, an embedded terminal, or recent CWD suggestions when spawning an agent start to matter a lot.

Here are some of the features I added for my own convenience.

Safety Hooks

I run all my agents in bypassPermissions mode. Isomux injects PreToolUse hooks into every SDK session that block dangerous commands before they execute.

  1. Git safety: blocks destructive git commands.
  2. Filesystem safety: blocks rm -rf on root/home paths while allowing it on temp directories.
  3. Isomux config protection: blocks all writes to ~/.isomux/, since that directory is managed by the server. Read operations are allowed (agents need to read agents-summary.json to discover each other).
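To give a feel for the kind of check such a hook might run, here is a deliberately small sketch. The patterns and function names are illustrative assumptions, not Isomux's actual rules, and pattern matching like this is far from an exhaustive safety filter. The SDK's hook wiring is left out; only the decision logic is shown.

```typescript
interface Verdict {
  allowed: boolean;
  reason?: string;
}

// Illustrative patterns only: force pushes and hard resets.
const DESTRUCTIVE_GIT: RegExp[] = [
  /\bgit\s+push\b.*\s(--force|-f)\b/,
  /\bgit\s+reset\s+--hard\b/,
];

// Assumption: "root/home paths" means /, ~, $HOME, /home, or /home/<user>.
function isProtectedTarget(target: string): boolean {
  const normalized = target.replace(/\/+$/, "") || "/";
  return (
    normalized === "/" ||
    normalized === "~" ||
    normalized === "$HOME" ||
    /^\/home(\/[^/]+)?$/.test(normalized)
  );
}

function checkBashCommand(command: string): Verdict {
  if (DESTRUCTIVE_GIT.some((re) => re.test(command))) {
    return { allowed: false, reason: "destructive git command" };
  }
  const rm = /\brm\s+-(\w+)\s+(\S+)/.exec(command);
  if (rm) {
    const flags = rm[1] ?? "";
    const target = rm[2] ?? "";
    // Temp directories pass simply because they are not protected targets.
    if (flags.includes("r") && flags.includes("f") && isProtectedTarget(target)) {
      return { allowed: false, reason: `rm -${flags} on protected path ${target}` };
    }
  }
  return { allowed: true };
}

// Writes anywhere under ~/.isomux/ are refused; reads are not checked here.
function isIsomuxConfigWrite(filePath: string, homeDir: string): boolean {
  return filePath === `${homeDir}/.isomux` || filePath.startsWith(`${homeDir}/.isomux/`);
}
```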

The embedded terminal is very handy when you need to run one of the blocked commands.

Embedded terminal in Isomux

Skills

In Claude Code, skills can come from a few places, some hardcoded and some discovered dynamically.

There is a hierarchy that determines which one you see if there's a name clash. From highest to lowest priority (see footnote 1):

  1. Hardcoded commands: /clear, /resume, etc. These are not actually skills because they are not a prompt - the logic is hardcoded in the CLI tool.
  2. Enterprise skills.
  3. User skills (~/.claude/skills/).
  4. Project skills (.claude/skills/). They are based on Claude Code's cwd.
  5. Claude Code bundled skills: /review, /simplify, /loop, etc.

In addition to dynamically fetching all these skills (except Enterprise), I have added my own tier of isomux-bundled skills, which have priority 4.5.
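Resolving a name clash then comes down to picking the entry from the highest tier. A sketch with hypothetical names, following the order of the list above with the Isomux tier slotted in at "4.5":

```typescript
// Tiers from highest to lowest priority. "isomux" sits between project
// skills and Claude Code's bundled skills.
const TIERS = ["hardcoded", "enterprise", "user", "project", "isomux", "bundled"] as const;
type Tier = (typeof TIERS)[number];

interface Skill {
  name: string;
  tier: Tier;
}

// For each skill name, keep the entry from the highest-priority tier.
function resolveSkills(found: Skill[]): Map<string, Skill> {
  const winners = new Map<string, Skill>();
  for (const skill of found) {
    const current = winners.get(skill.name);
    if (!current || TIERS.indexOf(skill.tier) < TIERS.indexOf(current.tier)) {
      winners.set(skill.name, skill);
    }
  }
  return winners;
}
```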

I added skills like:

  • /isomux-peer-review: tells the agent to read the ongoing conversation with another agent and give feedback.
  • /isomux-all-hands: shows what everyone is working on.

Voice prompting

One advantage of the frontend being browser-based is that we can leverage the existing voice-to-text APIs.

The only issue is that Chrome blocks these APIs on pages served over plain HTTP, unless the page is on localhost. This is fine when running isomux locally, since you access it at localhost:4000, but it has an annoying interaction with Tailscale.

With Tailscale, I access isomux at port 4000 of my server rather than of my own localhost. The server is inside my Tailscale network, but Chrome doesn't care about that and still blocks it.

The workaround is to get a TLS certificate for your server and connect over HTTPS. This is a common issue, so the fix is well established.

More details on the Tailscale setup, including HTTPS, are on isomux.com.

Attention tracking and notifications

The attention system is simple but effective. An agent "needs attention" when it transitions from a working state to a terminal state while the user is looking at a different agent.

On the office view, agents needing attention get a pulsing indicator. Combined with sound notifications (when the browser tab is hidden), you never miss when an agent finishes or gets stuck.
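The rule fits in a few lines. The state names below are hypothetical, and treating every non-working state as a stopping point is my simplification:

```typescript
// Hypothetical state names.
type AgentState = "thinking" | "tool_calling" | "waiting" | "error" | "idle";

const WORKING: ReadonlySet<AgentState> = new Set<AgentState>(["thinking", "tool_calling"]);

// An agent needs attention when it stops working while the user is
// focused on a different agent (or on none at all).
function needsAttention(
  prev: AgentState,
  next: AgentState,
  agentId: string,
  focusedAgentId: string | null,
): boolean {
  const stoppedWorking = WORKING.has(prev) && !WORKING.has(next);
  return stoppedWorking && agentId !== focusedAgentId;
}
```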

Auto-generated conversation topics

Each agent displays a short topic below its nametag, like "Fixing auth middleware tests" or "Refactoring WebSocket layer."

What's interesting is how they're generated. When the first user message comes in, the server fires off an unstable_v2_prompt() call behind the scenes. It builds a context snippet from the first user message (and the last few, if the topic is regenerated later) and then asks for a topic in 8 words or fewer.

Orchestration tools should be mindful of server-initiated prompts like this. They spend user tokens doing something that's not directly answering the user.

In this case, it's a trivial amount, but I still use a cheaper model (Sonnet).

The topic is included in agents-summary.json, helping agents know what others are up to. It is also persisted per session in sessions.json, so it survives server restarts and shows up when browsing past sessions to resume.

Final Thoughts

It's great to have my own malleable orchestration tool. Oh, I don't like Claude Code's plan mode? No problem, I can roll out my own vision.

Even if nobody else uses Isomux, it provides a ton of value to me. There's no feature pulling me back to raw Claude Code.

That said, I think Isomux - especially with Tailscale - can provide real value to people going from Level 5 to 6.

The gap going from Level 1 to Level 5 was mostly about models getting smarter. But for 5 to 6, I think the orchestration tool matters more.

We'll all be working with agents, so it's important to really like your orchestration tool. The orchestration tool is the new editor.

To try: isomux.com. (Do so at your own peril; it's not tested beyond my setup.)


Want to leave a comment? You can post under the LinkedIn post or the X post.

Footnotes

  1. MCP skills and commands live in their own namespace so they never collide with skills (e.g. /mcp__github__list_prs).
