Context Composer: Editing Your Agent's Context Behind Its Back

Context accumulates junk, degrading attention and increasing costs.

The standard answers are coarse:

  • /clear (lose everything)
  • /compact (no control over what survives)

Recently, new patterns have emerged that give more control [1]:

  • Conversation forking to remove the last N turns.
  • Subagents for token-expensive computations or searches, where only the final result makes it to the main context.
  • /btw for aside questions that don't stay in context.

To me, the interesting question is:

What is the final form of context management? What if we give users full control?

To examine this, I built Context Composer, a proxy that sits between Claude Code and Anthropic's servers.

It lets you see the raw text flowing through, but more importantly, it gives you full control to edit it.

It's like doing brain surgery on your agent:

  • Delete obsolete versions of code that you've iterated upon
  • Compact only unimportant messages
  • Fix past hallucinations with free-form editing
  • Offload to the filesystem context that's unlikely to be needed again
  • Strip a 10KB tool result that mattered for just one turn

You can do all this from a web UI or a CLI, which means an agent can do it too.

The vision is that you can go to an agent and say something like,

"strip from context the part where we went in the wrong direction, summarize long tool results and delete useless ones, and compact that tangent we went on about lunch"

and it just works.

Instead of /compact, we could create a new skill, /compose-context, where you pass a description like the one above, and the agent translates it into CLI calls.

This post explains how it works. The source is at github.com/nmamano/context-composer.

How to use it

To intercept traffic with the proxy, set ANTHROPIC_BASE_URL when running Claude Code:

bun run proxy                                    # start the proxy
ANTHROPIC_BASE_URL=http://localhost:8788 claude  # a stock Claude Code

Then, use Claude Code as usual. The proxy is credential-agnostic, so your existing Claude subscription works.

You can open http://localhost:8788/ui to manipulate its context. The CLI is described below.

Overview

The Anthropic API is stateless, which means that Claude Code resends the entire conversation on every turn. That resend is the interception point:

Claude Code  ──►   Context Composer (proxy)   ──►  Anthropic API
               decompose → reconcile → recompose

Each incoming request is:

  1. decomposed into context frames, which are "units of context" that can be manipulated independently.
  2. reconciled against the proxy's frame store.
  3. recomposed with your edits applied before forwarding.

The full resend also means that we don't rely on capturing responses coming from the server. The proxy only acts on the forward pass.

Frames

Each user message plus reply becomes one frame. This screenshot from the browser UI shows the frame view:

Context Composer's frame view

Frame titles are auto-generated by a cheap model.

The system prompt is also just a frame, which means that you can, e.g., delete it, though the provider (Anthropic's servers) will likely reject your request.
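
A minimal sketch of that decomposition rule, assuming the Anthropic Messages API request shape (a top-level system prompt plus a messages array) and assuming that a user message made up only of tool_result blocks continues the current frame instead of starting a new one. The Frame type, the decompose function, and the "sys"/"t1" id numbering are hypothetical and not taken from the project's source.

```typescript
// Shapes modeled on the Anthropic Messages API request body.
type Block = { type: string; [key: string]: unknown };
type Message = { role: "user" | "assistant"; content: string | Block[] };

// Hypothetical frame shape; the real project likely stores more metadata.
type Frame =
  | { id: string; kind: "system"; text: string }
  | { id: string; kind: "turn"; messages: Message[] };

// True for a user message that only carries tool results (part of a tool loop).
function isToolResultOnly(m: Message): boolean {
  return (
    m.role === "user" &&
    Array.isArray(m.content) &&
    m.content.length > 0 &&
    m.content.every((b) => b.type === "tool_result")
  );
}

// Group the flat message list into one system frame plus one frame per user turn.
function decompose(system: string, messages: Message[]): Frame[] {
  const result: Frame[] = [{ id: "sys", kind: "system", text: system }];
  for (const m of messages) {
    const last = result[result.length - 1];
    const startsTurn = m.role === "user" && !isToolResultOnly(m);
    if (last.kind !== "turn" || startsTurn) {
      // result[0] is the system frame, so the first turn gets id "t1".
      result.push({ id: `t${result.length}`, kind: "turn", messages: [m] });
    } else {
      last.messages.push(m);
    }
  }
  return result;
}

const demoFrames = decompose("You are a coding agent.", [
  { role: "user", content: "What is 2+2?" },
  { role: "assistant", content: "4." },
  { role: "user", content: "Summarize README.md" },
  { role: "assistant", content: [{ type: "tool_use", id: "call_1", name: "Read", input: {} }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "call_1", content: "..." }] },
  { role: "assistant", content: "It describes the project." },
]);
console.log(demoFrames.map((f) => f.id)); // ids: sys, t1, t2
```

Under these assumptions, the whole tool loop (question, tool call, tool result, final answer) lands in a single frame.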

The operation catalog

Operations act on frames and are available via the browser UI or the CLI.

  • delete: remove a frame.
  • revert: undo any past operation.
  • edit: edit the frame's content freely [2].
  • compact: swap the frame for a short summary.
  • offload: park a frame's content in a file; a stub with the file path takes its place, so the agent can read it back on demand with its own file tools.
  • restore: undo an offload.
  • move: reorder frames.
  • add: insert a new frame at any point in the context, with any content you want.
  • combine: merge frames.
  • drop-results: drop tool results in a frame, keeping the calls.
  • summarize-results: summarize tool results.
  • split: cut a frame into parts:
The split form on a tool-loop frame: its four messages numbered #0-#3 (question, Read tool call, tool result, final answer) with 'cut here' checkboxes between them; the last ticked, separating the tool loop from the answer
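
As an illustration of one operation from the catalog, here is a minimal sketch of split, assuming cut points are expressed as "cut after message #i", as the checkboxes in the split form suggest. The splitAt function is hypothetical; the project may represent cuts differently.

```typescript
// Split a frame's messages at the ticked cut points.
// A cut value i means "cut between message #i and message #i+1".
function splitAt<T>(messages: T[], cutsAfter: number[]): T[][] {
  // Deduplicate and sort so the cuts can be applied left to right.
  const cuts = Array.from(new Set(cutsAfter)).sort((a, b) => a - b);
  for (const c of cuts) {
    if (!Number.isInteger(c) || c < 0 || c >= messages.length - 1) {
      throw new RangeError(`invalid cut point: ${c}`);
    }
  }
  const parts: T[][] = [];
  let start = 0;
  for (const c of cuts) {
    parts.push(messages.slice(start, c + 1));
    start = c + 1;
  }
  parts.push(messages.slice(start));
  return parts;
}

// The tool-loop frame from the screenshot: messages #0-#3, last checkbox ticked.
const toolLoop = ["question", "Read tool call", "tool result", "final answer"];
console.log(splitAt(toolLoop, [2]));
// → [["question", "Read tool call", "tool result"], ["final answer"]]
```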

CLI

The CLI talks to the running proxy via its local control API (default: localhost:8788), so it operates on the same live state.

By default, the CLI targets the conversation with the most recent activity, i.e., the one you're actively having in your Claude Code session.

bun run ctx conversations     # list conversations (active one marked *)
bun run ctx list              # list the frames (ids, titles, token counts)
bun run ctx show t3           # print one frame's full content + metadata
bun run ctx delete t3         # drop a frame
bun run ctx offload t2        # move the content to a file, leaving a reference
bun run ctx revert            # undo the last operation
bun run ctx compose --dump    # print the context the provider would get right now

In practice, I expect the browser UI to be most useful for understanding and debugging context issues. Actual context shaping should probably be driven by an agent via the CLI.

How it works

Reconciliation

Claude Code is unaware of our proxy. If you delete Turn 3 at the proxy level, the agent still believes Turn 3 happened, so it resends it on every request.

Thus, the proxy needs to keep track of past frames and mark the ones that have been deleted.

The same one-to-one matching handles edits, filesystem offloads, etc.: the agent keeps resending the original, while the proxy keeps a map from it to your edited version.
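
A minimal sketch of that mapping, assuming frames are matched by a fingerprint of their original content (here, the JSON of their messages). FrameStore, Override, and the fingerprint choice are hypothetical; the project's actual matching logic may differ.

```typescript
type Message = { role: "user" | "assistant"; content: string };
type Frame = { messages: Message[] };

// What the proxy should emit in place of the original frame.
type Override =
  | { kind: "deleted" }
  | { kind: "replaced"; messages: Message[] };

class FrameStore {
  // Keyed by a fingerprint of the ORIGINAL frame, because that is what the
  // client keeps resending on every request.
  private overrides = new Map<string, Override>();

  private static fingerprint(frame: Frame): string {
    return JSON.stringify(frame.messages);
  }

  setOverride(original: Frame, override: Override): void {
    this.overrides.set(FrameStore.fingerprint(original), override);
  }

  // Reconcile incoming frames against stored overrides, then build the
  // message list that gets forwarded to the provider.
  recompose(incoming: Frame[]): Message[] {
    const out: Message[] = [];
    for (const frame of incoming) {
      const override = this.overrides.get(FrameStore.fingerprint(frame));
      if (override === undefined) out.push(...frame.messages);
      else if (override.kind === "replaced") out.push(...override.messages);
      // kind === "deleted": emit nothing for this frame
    }
    return out;
  }
}

const turn = (q: string, a: string): Frame => ({
  messages: [
    { role: "user", content: q },
    { role: "assistant", content: a },
  ],
});

const store = new FrameStore();
store.setOverride(turn("Where should we get lunch?", "Tacos."), { kind: "deleted" });
store.setOverride(turn("What is 9*11?", "99"), {
  kind: "replaced",
  messages: turn("What is 9*11?", "98").messages,
});

// The client resends the originals; the proxy forwards the edited view.
const forwarded = store.recompose([
  turn("Fix the login bug", "Done."),
  turn("Where should we get lunch?", "Tacos."),
  turn("What is 9*11?", "99"),
]);
console.log(forwarded.map((m) => m.content));
// → ["Fix the login bug", "Done.", "What is 9*11?", "98"]
```

In this sketch, two frames with identical content would share a fingerprint, so a real store would also need positional information to tell them apart.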

Identifying conversations

Claude Code reuses the same connection for background requests: title generation, quota probes, next-message suggestions, and so on.

To the proxy, these look indistinguishable from user conversations, so the proxy needs a way to identify when a request is for your conversation vs a background request.

The same issue arises when you /resume a session: the agent resends the whole conversation, and the proxy has to recognize it as the same one it saw before so your edits are still there.

The proxy handles all of this by identifying conversations based on the first message after the system prompt. Because Claude Code is unaware of your edits, it resends the original first message on every turn, even if you later edit it in the proxy. That makes the first message a stable identifier for the conversation.

This works for background requests because they have distinct first messages. One limitation is that two different user conversations opening with the exact same first message get mixed up. The proxy has no way to tell them apart, and I haven't found an easy workaround yet.
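
A minimal sketch of this keying scheme, assuming the first message is hashed into a lookup key. The hashText and conversationKey functions are hypothetical, and the FNV-1a style hash is only a stand-in for whatever the project actually uses.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// FNV-1a style 32-bit hash over UTF-16 code units; any stable hash would do.
function hashText(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16);
}

// The conversation key depends only on the first message after the system prompt.
function conversationKey(messages: Message[]): string {
  if (messages.length === 0) throw new Error("request has no messages");
  return hashText(messages[0].content);
}

const opener: Message = { role: "user", content: "Fix the login bug" };
const firstRequest: Message[] = [opener];
const laterRequest: Message[] = [
  opener,
  { role: "assistant", content: "Done." },
  { role: "user", content: "Now add tests" },
];
const backgroundRequest: Message[] = [
  { role: "user", content: "Write a short title for this chat" },
];

// Same conversation across turns (and across /resume): same key.
console.log(conversationKey(firstRequest) === conversationKey(laterRequest)); // true
// A background request opens with a different message, so it gets its own key.
console.log(conversationKey(firstRequest) === conversationKey(backgroundRequest)); // false
```

The limitation described above is visible here too: a second, unrelated conversation that also opens with "Fix the login bug" would produce the same key.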

Browser UI

There are three views:

  • Conversation view: the chat as the server/model currently sees it, which differs from what the Claude Code client sees. Clicking a message opens a side panel with the content and metadata for its frame.
Context Composer's conversation view: five short turns as chat bubbles; the middle turn ('Name one primary color' / 'Red.') is selected, opening the details side panel with that frame's title, summary, fields, and messages
  • Frame view: the main manipulation surface.
The details panel for a frame: title and description editable in place with AI-regenerate buttons, metadata fields, and the frame's messages with per-message edit pencils - plus an explicit 'current emission' vs 'source' split for overridden frames
  • History view: a git-like history of all context edits. Every context mutation is a commit. Chat messages are captured as well, creating a complete append-only timeline of context changes. This provides observability and makes changes reversible.
The history tab: five commits - a delete wearing a 'reverted' chip, the revert that undid it, an offload with its artifact path, an add, and a split - each with frame links and a revert button
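
A minimal sketch of such an append-only history, assuming a revert is recorded as a new commit that points at the commit it undoes (which matches the "reverted" chip plus separate revert entry in the screenshot). The Commit shape and the EditHistory class are hypothetical.

```typescript
type Commit = {
  seq: number; // position in the append-only log, starting at 1
  op: string; // e.g. "delete", "offload", "revert"
  frameIds: string[];
  reverts: number | null; // for revert commits: seq of the commit being undone
};

class EditHistory {
  private log: Commit[] = [];

  // Every context mutation appends a commit; nothing is ever rewritten.
  commit(op: string, frameIds: string[], reverts: number | null = null): Commit {
    const c: Commit = { seq: this.log.length + 1, op, frameIds, reverts };
    this.log.push(c);
    return c;
  }

  // Reverting is itself a commit, so a revert can later be reverted too.
  revert(seq: number): Commit {
    const target = this.log.find((c) => c.seq === seq);
    if (target === undefined) throw new Error(`no commit with seq ${seq}`);
    return this.commit("revert", target.frameIds, seq);
  }

  // A commit is reverted if some revert that is still in effect points at it.
  isReverted(seq: number): boolean {
    return this.log.some((c) => c.reverts === seq && !this.isReverted(c.seq));
  }

  entries(): readonly Commit[] {
    return this.log;
  }
}

const editHistory = new EditHistory();
const del = editHistory.commit("delete", ["t3"]);
editHistory.revert(del.seq);
console.log(editHistory.isReverted(del.seq)); // true
console.log(editHistory.entries().length); // 2: the delete and the revert both stay in the log
```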

Limitations

KV Caching

There is a good reason why existing context management tools are "append-only" shaped: if you edit a single token in the middle of the context, the provider needs to rebuild the KV cache from that token onward.

So, frequent and small context modifications are inefficient (unless new caching mechanisms are invented). There are two practical ways of using Context Composer:

  • Make many edits in a row. You pay for those only once, when you send the next message.
  • Focus single edits only on the latest frames, letting the old ones settle into a good shape.
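
The arithmetic behind both strategies can be sketched as follows, assuming simple prefix caching at frame granularity: everything before the first changed frame is reused, and everything from that frame onward is recomputed. The cacheImpact function and the token counts are made up for illustration, and the sketch ignores provider specifics such as cache breakpoints and expiry.

```typescript
// frameTokens: token count of each frame in the context about to be sent.
// changedIndices: positions whose content differs from the previous request.
function cacheImpact(
  frameTokens: number[],
  changedIndices: number[],
): { reused: number; recomputed: number } {
  const firstChanged =
    changedIndices.length === 0 ? frameTokens.length : Math.min(...changedIndices);
  let reused = 0;
  let recomputed = 0;
  frameTokens.forEach((tokens, i) => {
    if (i < firstChanged) reused += tokens;
    else recomputed += tokens;
  });
  return { reused, recomputed };
}

const tokens = [2000, 500, 10000, 300, 700]; // hypothetical per-frame sizes

// Editing an old frame invalidates almost everything after it.
console.log(cacheImpact(tokens, [2])); // { reused: 2500, recomputed: 11000 }
// Editing only the latest frame keeps most of the cache.
console.log(cacheImpact(tokens, [4])); // { reused: 12800, recomputed: 700 }
// Batching three edits costs about the same as the earliest one alone.
console.log(cacheImpact(tokens, [1, 2, 4])); // { reused: 2000, recomputed: 11500 }
```

Sending a request after each of those three edits separately would recompute 11500 + 11000 + 700 tokens in this model, versus 11500 for the batch.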

Hands-on management

By design, the proxy doesn't do anything on its own; it just forwards faithfully. Editing the context is on you.

As mentioned, the best way to use the tool is probably through an agent using the CLI.

Known bugs

Two conversations that start with the exact same first message get mixed up because the proxy can't tell them apart (see "Identifying conversations" above).

Claude Code only

The frame model isn't provider-specific, but the decompose/compose layer is.


Want to leave a comment? You can post under the LinkedIn post or the X post.

Footnotes

  1. My own meta-harness, isomux, has a /soft-handoff command, where one agent hands off a task to another agent via a direct message, while staying around to answer questions from the new agent (see more here). This works across models from different providers, like a Claude-to-Codex handoff.

  2. You can use this to "gaslight" the agent, making it think it said things it never said. For example: (1) ask it for 9*11; (2) edit the frame to contain a wrong answer; (3) ask for 9*11 again. It will likely appear confused and apologize for getting it wrong earlier.
