Edge-native token compression
Compress prompts. Keep context. Save tokens.
Edgee compresses prompts sent to coding agents like Claude Code, Codex, and OpenCode. Up to 50% token cost savings, semantically lossless, under 15ms P50 overhead at the edge.
- 60-80% tool result compression on typical coding agent sessions
- <15ms P50 compression overhead at the edge
- 100% output quality: semantically lossless on code tasks
- 0 code changes: drop-in CLI wrapper
Internal benchmarks on a mixed suite of coding-agent workflows. Your mileage may vary.
How Edgee compresses prompts
Compression happens at the edge, between your coding agent and the LLM provider. We apply a multi-pass pipeline that reduces token count while preserving semantic intent.
1. Prompt ingress: your Claude Code call hits the nearest Edgee edge node.
2. Tool result compression: file contents, grep results, shell command output, and API responses are compressed.
3. Forward to provider: the compressed prompt is sent to Claude / OpenAI with your original API key.
Compression is designed to be semantically lossless for code-oriented tasks. We validated this on a suite of coding benchmarks where the compressed prompt produced outputs statistically indistinguishable from the original.
Compression is not magic: extremely short prompts compress less, and some highly structured prompts (e.g., tool-use schemas) are passed through without compression to avoid altering their meaning. When in doubt, Edgee skips compression. We'd rather save you 30% reliably than 60% with occasional quality hits.
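To make the idea concrete, here is a minimal sketch of one such pass. This is a hypothetical tool-result compressor for illustration only, not Edgee's actual algorithm: it collapses runs of repeated lines, elides the middle of oversized outputs, and leaves short inputs untouched, in the spirit of "when in doubt, skip compression."

```python
# Hypothetical sketch of a single tool-result compression pass.
# Real multi-pass compression is more sophisticated; the shape is the
# same: reduce tokens while keeping the semantic content.

def compress_tool_result(text: str, max_lines: int = 200) -> str:
    """Collapse consecutive duplicate lines and elide the middle of
    very long outputs, keeping head and tail for context."""
    lines = text.splitlines()

    # Pass 1: collapse runs of identical lines (common in logs).
    deduped: list[str] = []
    for line in lines:
        if deduped and deduped[-1] == line:
            continue
        deduped.append(line)

    # Pass 2: if still too long, keep head and tail and mark the elision.
    if len(deduped) > max_lines:
        keep = max_lines // 2
        deduped = (
            deduped[:keep]
            + [f"... [{len(deduped) - 2 * keep} lines elided] ..."]
            + deduped[-keep:]
        )
    return "\n".join(deduped)

noisy = "ok\n" * 50 + "error: missing file\n"
print(compress_tool_result(noisy))  # "ok" once, then the error line
```

Short prompts pass through nearly unchanged, which is why the savings concentrate in verbose tool output rather than in the instructions themselves.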
Drop-in install
Install the CLI once. Launch any supported coding agent through it. Compression runs per session.
```shell
# Install the Edgee CLI
curl -fsSL https://edgee.ai/install.sh | bash

# Launch Claude Code through the compression proxy
edgee launch claude
```
Full CLI guide in the Edgee documentation.
Measure every saved token
Every session reports its compression ratio, tokens saved, and estimated cost avoided.
- Per-session compression ratio
- Tokens saved over time
- Cost avoided estimation
Example dashboard (last 30 days): 39% average compression ratio, $142 cost avoided, 3.8M tokens saved.
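The reported figures reduce to simple token accounting. A hedged sketch of the arithmetic, where the per-million-token rate is a placeholder rather than a real provider price:

```python
# Illustrative token accounting: deriving a compression ratio and a
# cost-avoided figure from per-session token counts. The USD rate is a
# placeholder, not an actual provider or Edgee billing rate.

def session_report(original_tokens: int, compressed_tokens: int,
                   usd_per_million: float = 3.0) -> dict:
    saved = original_tokens - compressed_tokens
    ratio = saved / original_tokens  # fraction of input tokens removed
    return {
        "compression_ratio": round(ratio, 2),
        "tokens_saved": saved,
        "cost_avoided_usd": round(saved / 1_000_000 * usd_per_million, 2),
    }

print(session_report(120_000, 73_200))
# {'compression_ratio': 0.39, 'tokens_saved': 46800, 'cost_avoided_usd': 0.14}
```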
Works with your stack
Coding agent prompt compression is only useful if it fits where your prompts already live. Supported agents today, plus integration points for anything OpenAI-compatible.
Claude Code
Compression applied to every prompt sent to Anthropic. Full CLAUDE.md + MCP compatibility.
OpenAI Codex
Compresses requests to Codex models while preserving tool-use schemas.
OpenCode
Compression runs transparently on every OpenCode session.
Cursor
Cursor integration is in development. Join the waitlist on the coding-agents page.
OpenClaw
Integration is in development.
Custom OpenAI-compatible clients
Point any OpenAI-compatible SDK at the Edgee endpoint. Compression applies automatically.
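For example, with the official OpenAI Python SDK this is a one-line configuration change. The base URL below is a placeholder for illustration; use the endpoint from your Edgee account, and note that your existing provider API key is passed through unchanged.

```python
# Configuration sketch: pointing an OpenAI-compatible client at an
# Edgee endpoint. The base_url is a placeholder, not a real endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.edgee.example/v1",  # placeholder Edgee endpoint
    api_key="sk-...",  # your existing provider key, unchanged
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this diff."}],
)
print(resp.choices[0].message.content)
```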
Technical FAQ
Stop sending verbose prompts. Start compressing.
Works with your existing API keys and plans. No lock-in.