Claude Code Compression - Edgee documentation

Claude Code Compression is the Edgee compression bundle tuned for Claude Code traffic — the three named strategies described in Token Compression, pre-configured for the Claude API wire format and the typical shape of a Claude Code session. You choose which strategies to enable. The CLI turns on a sensible default; the Console lets you toggle each one per API key.

Strategy	What it does for Claude Code	Default	Customer-traffic average
Tool Result	Trims `tool_result` payloads (file reads, grep, shell, API responses) before they reach Claude. Lossless.	✅ on	−19%
Tool Surface	Drops MCP servers irrelevant to the current task via a virtual MCP that routes to the correct server.	⚠️ opt-in	~−25% projected
Output	Reduces verbosity of model responses without losing technical content. Same answer, fewer tokens.	⚪ opt-in	−6.5% when enabled

Per-strategy averages don’t aggregate — they’re measured on different baselines. Customer aggregate token-bill reduction across active customers (rolling 30 days) sits at approximately 20%, with zero measurable drift on SWE-Bench Verified samples.

Tool Result Trimming

tool_result_trimming filters the tool-result content Claude Code sends back to itself: file contents, grep and search outputs, shell stdout/stderr, API responses, database query results. Lossless on tool_result payloads — Claude receives the same technical content with redundant framing removed. User messages and assistant turns are not modified. → Full strategy reference: Token Compression / Tool Result Trimming.

Tool Surface Reduction

Edgee creates a virtual MCP server that Claude sees. The virtual MCP classifies the user’s task and searches for the correct real MCP server to use. It sends the result back to the client, which then executes the real MCP server. The IDE still exposes everything; nothing changes for the developer’s setup. → Full strategy reference: Token Compression / Tool Surface Reduction.

Output Brevity

output_brevity reduces the verbosity of Claude’s responses. Three levels are available (light, medium, hard). Off by default for Claude Code sessions because output is a small share (~1%) of total token volume — turn it on if your Claude Code workflow leans heavy on long-form responses. → Full strategy reference: Token Compression / Output Brevity.

Receipts

+26.2% more instructions completed on the same Claude Pro plan. 20.8% more efficient per instruction. 5.1% cheaper per task on a cost-adjusted basis. Source: edgee-ai/claude-compression-lab · Endurance challenge writeup

Get started

The fastest path is the Edgee CLI. tool_result_trimming is on by default; the other two strategies are opt-in toggles in the Console.

macOS / Linux
Homebrew
Windows (PowerShell)

curl -fsSL https://install.edgee.ai | bash

brew install edgee-ai/tap/edgee

irm https://install.edgee.ai/install.ps1 | iex

Verify the installation:

edgee --version

Then launch Claude Code through Edgee:

edgee launch claude

After your session, the CLI prints a link to view per-strategy savings in the Edgee Console.

Full CLI guide

Install the CLI, authenticate, and launch Claude Code in under a minute.

Toggling individual strategies

In the Edgee Console, open Dashboard and manage your Claude Code’s settings right from the UI.

Enable tool_surface_reduction to opt into the tool-surface compression.
Enable output_brevity if your Claude Code workflow produces long-form output worth tightening.
Disable tool_result_trimming only when you want to compare against an uncompressed baseline.

For team-managed keys, the same toggles are available per-member from Team management → agent settings. See Team management.

Manual setup (advanced)

If you prefer not to use the CLI, configure Claude Code to route through Edgee — see Manual setup — then enable the strategies you want from the Edge Models section of the Console.

Lossiness

tool_result_trimming is lossless on tool_result payloads. tool_surface_reduction is lossless on the model’s perspective: Claude still sees the virtual MCP that can route to any real MCP server. output_brevity is not lossless on the prose dimension, it intentionally compresses prose verbosity.

Token Compression

Deep dive on each strategy.

https://mintcdn.com/edgee/CrNen493EQpoYoa2/images/icons/codex.svg?fit=max&auto=format&n=CrNen493EQpoYoa2&q=85&s=0f19fa96ee1277109c66c3b411f868c0

Codex Compression

Same three strategies, tuned for Codex.

​Tool Result Trimming

​Tool Surface Reduction

​Output Brevity

​Receipts

​Get started