POST /v1/responses). This endpoint is compatible with OpenAI’s Responses API, making it easy to use with tools like the Codex CLI.
Overview
The/v1/responses endpoint implements OpenAI’s Responses API format, which differs from the Chat Completions endpoint in several ways:
- Input: Accepts either a plain string or a flat array of typed input items (messages, tool calls, tool outputs)
- Tools format: Flat structure
{"type":"function","name":"...","description":"...","parameters":{...}}instead of the nested{"type":"function","function":{...}}used by Chat Completions - Output: Returns an
outputarray of typed items instead of achoicesarray - Instructions: Supports a separate
instructionsfield as an alternative to a system message in the input array
- Using tools or SDKs that target the OpenAI Responses API (e.g. Codex CLI)
- Building agentic workflows that pass tool calls and tool outputs in a flat item array
Authentication
Request Format
The model ID to use, with provider prefix.Examples:
openai/gpt-4o, anthropic/claude-sonnet-4-6, google/gemini-2.0-flashThe input to the model. Either:
- A plain string (treated as a single user message)
- An array of input items (messages, function calls, function call outputs)
System-level instruction prepended to the conversation. An alternative to including a
system role message in the input array.Whether to stream the response as Server-Sent Events (SSE).
Maximum number of tokens to generate.
Tools available to the model. Uses the Responses API flat format (no nested
function key).Controls tool selection:
"auto"— model decides whether to call a tool (default)"none"— model must not call any tool"required"— model must call a tool{"type": "function", "name": "tool_name"}— model must call the specified tool
Sampling temperature between 0 and 2. Higher values produce more random outputs.
Nucleus sampling probability. Alternative to temperature.
Edgee Extensions
List of string tags for categorizing and filtering requests in analytics and logs.
Enable debug mode to include additional information in the response.
Model to use for token compression. See Token Compression.
List of Edgee-managed tool IDs to include automatically.
Response Format
Non-Streaming Response
Unique identifier for the response, prefixed with
resp_.Always
"response".Always
"completed" for non-streaming responses.Unix timestamp (as a float) of when the response was created.
The model used to generate the response.
Array of output items produced by the model.
Token usage statistics.
Streaming Response
Whenstream: true, the response is sent as Server-Sent Events (SSE). Each event is a JSON object with a type field.
Event sequence for a text response:
| Event type | Description |
|---|---|
response.created | Stream opened; initial response object with status: "in_progress" |
response.output_item.added | A new output item started (message or function call) |
response.content_part.added | A new content part started within an output item |
response.output_text.delta | Incremental text chunk |
response.output_text.done | Text content complete |
response.content_part.done | Content part complete |
response.output_item.done | Output item complete |
response.completed | Stream complete; final response object with usage |
| Event type | Description |
|---|---|
response.output_item.added | New function call item started |
response.function_call_arguments.delta | Incremental arguments chunk |
response.function_call_arguments.done | Arguments complete |
response.output_item.done | Function call item complete |
Special Headers
Enable token compression to reduce token usage. See Token Compression.
Comma-separated list of tags for analytics and logs.Example:
X-edgee-tags: production,agent,codexEnable debug mode for additional response information.
Examples
Basic Text Input
With Instructions and Message Array
Streaming
With Tools
Multi-Turn with Tool Results
Error Handling
See the Errors page for details on error responses.Related Endpoints
- Chat Completions — OpenAI-compatible endpoint for multi-provider support
- Anthropic Messages — Native Anthropic Messages API format