
| Strategy | What it does for Codex | Default | Customer-traffic average |
|---|---|---|---|
| Tool Result | Trims tool-call outputs (file reads, shell commands, search results) before they reach the model. Lossless. | ✅ on | −19% |
| Tool Surface (alpha) | Drops MCP servers, skills, and tools irrelevant to the current task before the request hits the model. | ⚠️ opt-in | ~−25% projected |
| Output Brevity | Reduces verbosity of model responses without losing technical content. Same answer, fewer tokens. | ⚪ opt-in | −6.5% when enabled |
Tool Result Trimming
tool_result_trimming filters the tool-call outputs Codex receives — file reads, shell commands, search results — before they reach the model. Lossless on tool-result payloads. User messages and assistant turns are not modified.
→ Full strategy reference: Token Compression / Tool Result Trimming.
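To make the idea concrete, here is a minimal sketch of one lossless trimming tactic (illustrative only, not Edgee's actual algorithm): verbatim-duplicate tool payloads, such as the same file read twice in a session, are collapsed into back-references while every unique payload survives in full.

```python
import hashlib

def trim_tool_results(tool_results):
    """Replace repeated tool-result payloads with short back-references.

    Lossless in the sense that every unique payload still appears in
    full exactly once; only verbatim duplicates are collapsed.
    """
    seen = {}        # payload digest -> index of first occurrence
    trimmed = []
    for i, payload in enumerate(tool_results):
        digest = hashlib.sha256(payload.encode()).hexdigest()
        if digest in seen:
            trimmed.append(f"[duplicate of tool result #{seen[digest]}]")
        else:
            seen[digest] = i
            trimmed.append(payload)
    return trimmed

results = ["contents of src/main.rs", "ls output", "contents of src/main.rs"]
print(trim_tool_results(results))
# → ['contents of src/main.rs', 'ls output', '[duplicate of tool result #0]']
```

Nothing is discarded here, which is why this family of trims can claim losslessness: the model can always resolve a back-reference to the earlier full payload.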
Tool Surface Reduction (alpha)
tool_surface_reduction strips out the MCP servers, skills, and tools Codex wouldn’t use for the current task. The IDE still exposes everything; the model only ever sees the relevant subset.
→ Full strategy reference: Token Compression / Tool Surface Reduction.
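A toy illustration of the concept, where simple keyword matching stands in for whatever relevance scoring Edgee actually uses (the tool records below are made up):

```python
def reduce_tool_surface(tools, task_keywords):
    """Keep only tools whose name or description matches the current task.

    Illustrative substring filter; a production system would score
    relevance with a model rather than keyword matching.
    """
    keywords = [kw.lower() for kw in task_keywords]
    return [
        t for t in tools
        if any(kw in (t["name"] + " " + t["description"]).lower()
               for kw in keywords)
    ]

tools = [
    {"name": "git_commit",   "description": "Create a git commit"},
    {"name": "browser_open", "description": "Open a URL in a browser"},
    {"name": "run_tests",    "description": "Run the test suite"},
]
# For a "fix the failing tests and commit" task, the browser tool is dropped.
print([t["name"] for t in reduce_tool_surface(tools, ["git", "test"])])
# → ['git_commit', 'run_tests']
```

Because the filtering happens before the request is sent, the model never pays tokens for tool schemas it was never going to call.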
Output Brevity
output_brevity reduces the verbosity of Codex’s responses. Three levels are available (light, medium, hard). It is off by default for Codex sessions because output is a small share (~1%) of total token volume — turn it on if your Codex workflow leans heavily on long-form responses.
→ Full strategy reference: Token Compression / Output Brevity.
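One plausible way to realize such levels is a per-level instruction appended to the system prompt. The level names match the doc, but the suffix texts and function below are hypothetical, not Edgee's actual prompts:

```python
# Hypothetical system-prompt suffixes for each brevity level.
BREVITY_SUFFIXES = {
    "light":  "Be concise. Do not restate the question.",
    "medium": "Answer tersely. Skip pleasantries, preambles, and summaries.",
    "hard":   "Minimum tokens: code, commands, and key facts only.",
}

def apply_output_brevity(system_prompt: str, level: str) -> str:
    """Append the brevity instruction for the chosen level.

    Unknown levels (including "off") leave the prompt unchanged, which
    matches the strategy's off-by-default behavior for Codex sessions.
    """
    suffix = BREVITY_SUFFIXES.get(level)
    return f"{system_prompt}\n\n{suffix}" if suffix else system_prompt
```

This also shows why the strategy is lossy only on prose: it shapes how the model writes, not what tool calls or code it produces.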
Receipts
−49.5% fresh input tokens (1.14M → 574K per session). −35.6% total session cost ($2.58). Cache hit rate: 76% → 85%.
Source: edgee-ai/compression-lab · Stop paying Codex to re-read context
Get started
- macOS / Linux
- Homebrew
- Windows (PowerShell)
CLI guide
Install, authenticate, and launch Codex in under a minute.
Codex-specific: OpenAI Responses wire format
Codex uses the OpenAI Responses wire API. When routing through Edgee, the CLI automatically writes the correct provider config to ~/.codex/config.toml when you run edgee launch codex; you never need to edit this file manually.
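For reference, the written config plausibly looks like the sketch below. The model_provider, model_providers, wire_api, and env_key keys are standard Codex config.toml fields, but the Edgee endpoint and environment-variable name shown here are assumptions:

```toml
# Sketch of what `edgee launch codex` writes — illustrative values only.
model_provider = "edgee"

[model_providers.edgee]
name     = "Edgee"
base_url = "https://api.edgee.ai/v1"   # assumed endpoint
wire_api = "responses"                 # the OpenAI Responses wire format
env_key  = "EDGEE_API_KEY"             # assumed env var holding your API key
```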
Toggling individual strategies
In the Edgee Console, open Dashboard and manage your Codex settings right from the UI.
- Enable tool_surface_reduction to opt into the alpha tool-surface compression.
- Enable output_brevity if your Codex workflow produces long-form output worth tightening.
- Disable tool_result_trimming only when you want to compare against an uncompressed baseline.
Manual setup (advanced)
To configure Codex without the CLI, paste the config above into ~/.codex/config.toml and replace <YOUR_EDGEE_API_KEY> with your key from the Edgee Console. Then enable the strategies you want from the Edge Models section.
Lossiness
tool_result_trimming is lossless on tool-result payloads. tool_surface_reduction is lossless on the model’s perspective of available tools. output_brevity is not lossless on the prose dimension — it intentionally compresses prose verbosity. Across active customers (rolling 30 days), aggregate token bills are reduced by approximately 20% with zero measurable drift on SWE-Bench Verified samples.
Next
Token Compression
Deep dive on each strategy.
Claude Code Compression
Same three strategies, tuned for Claude Code.