Edge-native token compression

Compress prompts. Keep context. Save tokens.

Edgee compresses the prompts that coding agents like Claude Code, Codex, and OpenCode send to LLM providers: up to 50% token cost savings, semantically lossless on code tasks, with under 15 ms of P50 overhead at the edge.

  • 60–80% tool result compression on typical coding agent sessions
  • <15 ms P50 overhead for compression at the edge
  • 100% output quality, semantically lossless on code tasks
  • 0 code changes with the drop-in CLI wrapper

Internal benchmarks on a mixed suite of coding-agent workflows. Your mileage may vary.

How Edgee compresses prompts

Compression happens at the edge, between your coding agent and the LLM provider. We apply a multi-pass pipeline that reduces token count while preserving semantic intent.

  1. Prompt ingress: your Claude Code call hits the nearest Edgee edge node.

  2. Tool result compression: file contents, grep results, shell command output, and API responses are compressed.

  3. Forward to provider: the compressed prompt is sent to Claude / OpenAI with your original API key.
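Conceptually, the routing above amounts to pointing your agent at an edge proxy instead of the provider directly. A hedged sketch, assuming a hypothetical proxy URL (not a documented endpoint; `edgee launch claude` does this wiring for you):

```shell
# Hypothetical manual setup: route Claude Code traffic through an Edgee
# edge node by overriding the provider base URL. The URL below is an
# assumption for illustration only.
export ANTHROPIC_BASE_URL="https://edge.edgee.ai/v1"  # hypothetical edge node
export ANTHROPIC_API_KEY="sk-ant-..."                 # your original key, forwarded as-is
claude
```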

Compression is designed to be semantically lossless for code-oriented tasks. We validated this on a suite of coding benchmarks where the compressed prompt produced outputs statistically indistinguishable from the original.

Compression is not magic: extremely short prompts compress less, and some highly structured prompts (e.g., tool-use schemas) are passed through without compression to avoid altering their meaning. When in doubt, Edgee skips compression. We'd rather save you 30% reliably than 60% with occasional quality hits.
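The pass-through policy can be illustrated with a toy sketch. This is not Edgee's actual pipeline; the threshold is an assumption, and `gzip | base64` is a stand-in for the real multi-pass semantic compression:

```shell
#!/usr/bin/env sh
# Toy sketch of the "when in doubt, skip" policy: only compress payloads
# large enough to be worth it; otherwise pass through unchanged.
maybe_compress() {
  input="$1"
  # 200 bytes is an assumed cutoff, not Edgee's real one.
  if [ "${#input}" -lt 200 ]; then
    printf '%s' "$input"                  # too short: pass through untouched
  else
    printf '%s' "$input" | gzip | base64  # stand-in for the real pipeline
  fi
}
```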

Drop-in install

Install the CLI once. Launch any supported coding agent through it. Compression runs per session.

# Install the Edgee CLI
curl -fsSL https://edgee.ai/install.sh | bash

# Launch Claude Code through the compression proxy
edgee launch claude

Full CLI guide in the Edgee documentation.

Measure every saved token

Every session reports its compression ratio, tokens saved, and estimated cost avoided.

  • Per-session compression ratio
  • Tokens saved over time
  • Cost avoided estimation
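The report metrics are simple arithmetic over per-session token counts. A back-of-envelope sketch with assumed numbers (100k prompt tokens compressed to 60k, at an assumed $3 per million input tokens):

```shell
# Assumed session figures, for illustration only.
original=100000
compressed=60000
price_per_mtok=3   # USD per million input tokens (assumed)

saved=$((original - compressed))
ratio=$(awk "BEGIN { printf \"%.0f\", 100 * $saved / $original }")
cost_avoided=$(awk "BEGIN { printf \"%.2f\", $saved * $price_per_mtok / 1000000 }")

echo "tokens saved: $saved"            # 40000
echo "compression:  ${ratio}%"         # 40%
echo "cost avoided: \$$cost_avoided"   # $0.12
```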

Works with your stack

Coding agent prompt compression is only useful if it fits where your prompts already live. Edgee supports Claude Code, Codex, and OpenCode today, and exposes integration points for anything OpenAI-compatible.
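For OpenAI-compatible clients, integration typically means overriding the base URL while keeping your own key. A hedged sketch; the endpoint below is an assumption for illustration, not a documented Edgee URL:

```shell
# Hypothetical OpenAI-compatible integration: point the client's base URL
# at an Edgee endpoint; your existing API key is forwarded unchanged.
export OPENAI_BASE_URL="https://edge.edgee.ai/openai/v1"  # hypothetical endpoint
export OPENAI_API_KEY="sk-..."                            # your existing key
```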

Technical FAQ

Stop sending verbose prompts. Start compressing.

Works with your existing API keys and plans. No lock-in.