Compress
Compress a request payload without calling any LLM provider
Returns the request payload with its messages, input, system, and tools fields replaced by their compressed versions. No LLM call is made; this is a pure pre-processing step.
Intended for teams running their own LLM gateways (Oracle, Vercel, Cloudflare) who want Edgee token compression without routing requests through Edgee.
Authorizations
Body
An LLM request payload to compress. Accepts any of three wire formats — the format is auto-detected from the request body: a top-level "system" key indicates Anthropic Messages format; a top-level "input" key indicates OpenAI Responses API format; otherwise, OpenAI Chat Completions format is assumed.
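The detection rule above can be sketched client-side, for example to predict which format the endpoint will assume for a given body. This is a minimal illustration, not the gateway's actual implementation: the function name and the returned labels are hypothetical, and checking "system" before "input" simply follows the order in which the rules are listed.

```python
from typing import Any, Mapping


def detect_wire_format(body: Mapping[str, Any]) -> str:
    """Classify a request body by its top-level keys (labels are illustrative)."""
    # A top-level "system" key indicates the Anthropic Messages format.
    if "system" in body:
        return "anthropic_messages"
    # A top-level "input" key indicates the OpenAI Responses API format.
    if "input" in body:
        return "openai_responses"
    # Anything else is treated as OpenAI Chat Completions.
    return "openai_chat_completions"
```

Note that only top-level keys matter here: a Chat Completions body that carries a system prompt as a message inside its message list has no top-level "system" key and still falls through to the default.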
ID of the model to use. Format: {author_id}/{model_id} (e.g. openai/gpt-5.2)
"openai/gpt-5.2"
A list of messages comprising the conversation so far.
1
The maximum number of tokens that can be generated in the chat completion.
x >= 1
If set, partial message deltas will be sent, as in OpenAI. Streamed chunks are sent as Server-Sent Events (SSE).
Options for streaming response.
A list of tools the model may call. Currently, only function type is supported.
Controls which tool (if any) the model is allowed to call. Accepts a bare string (none / auto), a typed-mode object ({ "type": "auto" | "none" }), or a specific function reference.
none, auto
List of Edge Tool IDs to inject (e.g. edgee_current_time, edgee_generate_uuid). Each ID must be activated for your API key. When omitted or empty, only tools with hydration enabled for your org or API key are auto-injected. Invalid or non-activated IDs return 400 with invalid_edgee_tool_ids.
["edgee_current_time", "edgee_generate_uuid"]
Pending operation ID when continuing a conversation after Edge Tool execution (e.g. when mixing client-side and Edge Tools). The gateway injects stored Edge Tool results into the message history.
Optional tags to categorize and label the request. Useful for filtering and grouping requests in analytics and logs. Can also be sent via the x-edgee-tags header as a comma-separated string.
When true, the response includes additional debug information. Equivalent to the X-Edgee-Debug header.
Selects the compression bundle to apply to the request. Equivalent to the X-Edgee-Compression-Model header.
claude, opencode, codex, cursor
Response
Request compressed successfully. The response mirrors the input format with the messages, input, system, and tools fields replaced by their compressed versions. All other fields pass through unchanged. A compression metadata object is always appended.
The original request payload with compressed content fields replaced. All fields not touched by compression (model, temperature, top_p, stop_sequences, etc.) pass through unchanged. A compression object is always appended.
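The pass-through guarantee can be checked with a small comparison between the payload sent and the payload returned. The sketch below is only an illustration: the helper name is hypothetical, and it assumes the appended metadata object sits under a top-level key named "compression", which the text implies but does not spell out.

```python
from typing import Any, Mapping

# Fields the endpoint may rewrite with compressed content.
COMPRESSED_FIELDS = {"messages", "input", "system", "tools"}
# Assumed key of the appended metadata object (not confirmed by the reference).
APPENDED_FIELDS = {"compression"}


def changed_passthrough_fields(
    request: Mapping[str, Any], response: Mapping[str, Any]
) -> list[str]:
    """Return request fields that should have passed through but are missing or differ."""
    skip = COMPRESSED_FIELDS | APPENDED_FIELDS
    return sorted(
        key
        for key, value in request.items()
        if key not in skip and (key not in response or response[key] != value)
    )
```

An empty result means every field outside the compressed set came back unchanged; any name in the list points at a field worth inspecting before forwarding the payload to a provider.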
Token compression metrics appended to every /v1/compress response.
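Putting the pieces together, a request to /v1/compress might be assembled as follows. This is a sketch under several assumptions that this page does not confirm: the POST method, the Bearer authorization scheme, the "true" value for X-Edgee-Debug, and the base URL (passed in by the caller) are all placeholders, and build_compress_request is a hypothetical helper. The header names and the four compression bundle values come from the field descriptions above.

```python
import json
import urllib.request
from typing import Any, Mapping, Optional, Sequence

# Compression bundle values listed in this reference.
COMPRESSION_MODELS = {"claude", "opencode", "codex", "cursor"}


def build_compress_request(
    payload: Mapping[str, Any],
    api_key: str,
    base_url: str,
    tags: Optional[Sequence[str]] = None,
    debug: bool = False,
    compression_model: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request to the compress endpoint."""
    headers = {
        "Content-Type": "application/json",
        # Assumption: Bearer auth; check the Authorizations section for the real scheme.
        "Authorization": f"Bearer {api_key}",
    }
    if tags:
        # Tags may be sent as a comma-separated string in this header.
        headers["x-edgee-tags"] = ",".join(tags)
    if debug:
        # Assumption: the header carries the literal string "true".
        headers["X-Edgee-Debug"] = "true"
    if compression_model is not None:
        if compression_model not in COMPRESSION_MODELS:
            raise ValueError(f"unknown compression model: {compression_model!r}")
        headers["X-Edgee-Compression-Model"] = compression_model
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/compress",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",  # Assumption: the endpoint accepts POST with a JSON body.
    )
```

The returned object can be passed to urllib.request.urlopen to send it. Keeping construction separate from sending makes the headers and body easy to inspect in tests without network access.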