POST /v1/messages
Create message (Anthropic format)
curl --request POST \
  --url https://api.edgee.ai/v1/messages \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "claude-sonnet-4.5",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "<string>"
    }
  ],
  "system": "<string>",
  "stream": false,
  "tools": [
    {
      "name": "<string>",
      "input_schema": {},
      "description": "<string>"
    }
  ],
  "tool_choice": {
    "type": "auto"
  }
}
'
{
  "id": "<string>",
  "model": "<string>",
  "content": [
    {
      "type": "text",
      "text": "<string>"
    }
  ],
  "usage": {
    "input_tokens": 1,
    "output_tokens": 1
  },
  "stop_reason": "end_turn"
}
Creates a message using Anthropic’s native Messages API format. This endpoint provides the same API format as Anthropic’s official API, making it easy to migrate existing integrations or use Anthropic-specific features.
This endpoint only works with the Anthropic provider. For multi-provider support, use the Chat Completions endpoint with OpenAI format.

Overview

The /v1/messages endpoint implements Anthropic’s Messages API format, which differs from the OpenAI-compatible /v1/chat/completions endpoint in several key ways:
  • Model format: Use claude-sonnet-4.5 instead of anthropic/claude-sonnet-4.5 (no provider prefix)
  • Required max_tokens: The max_tokens parameter is always required (not optional like OpenAI)
  • System prompt: System messages use a separate system field instead of being part of the messages array
  • Response format: Returns Anthropic’s native response format with different structure and field names
Use this endpoint when:
  • Migrating from Anthropic’s API to Edgee
  • Using the Anthropic SDK or tools that expect the native Anthropic format
  • Requiring Anthropic-specific features or response structures
For new integrations or multi-provider support, we recommend using the Chat Completions endpoint instead.
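To make the differences above concrete, here is a minimal sketch of the same request expressed in both formats. The payloads are illustrative, not exhaustive:

```python
# The same request in Anthropic Messages format vs. OpenAI Chat
# Completions format, illustrating the differences listed above.
anthropic_request = {
    "model": "claude-sonnet-4.5",        # no provider prefix
    "max_tokens": 1024,                  # always required
    "system": "You are concise.",        # separate top-level field
    "messages": [{"role": "user", "content": "Hello!"}],
}

openai_request = {
    "model": "anthropic/claude-sonnet-4.5",  # provider prefix required
    # max_tokens is optional in this format
    "messages": [
        {"role": "system", "content": "You are concise."},  # part of messages
        {"role": "user", "content": "Hello!"},
    ],
}
```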

Authentication

This endpoint supports both authentication methods.
Anthropic-style (preferred):
x-api-key: <api_key>
OpenAI-style (also supported):
Authorization: Bearer <api_key>
If both headers are provided, x-api-key takes precedence. See the Authentication page for more details.

Request Format

model
string
required
The model ID to use. Use Anthropic model names without a provider prefix. Examples: claude-sonnet-4.5, claude-opus-4, claude-haiku-4
max_tokens
integer
required
The maximum number of tokens to generate before stopping. This parameter is required (unlike OpenAI’s API where it’s optional). Note that Claude models may stop before reaching this maximum if they hit a natural stopping point.
messages
array
required
Array of message objects representing the conversation history. Each message must have a role and content.
system
string | array
System prompt to guide the model’s behavior. This is separate from the messages array. It can be either:
  • A simple string for basic system prompts
  • An array of content blocks for structured system prompts (supports text blocks with cache_control)
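A structured system prompt might be built like this. The cache_control value follows Anthropic's prompt-caching convention (an "ephemeral" marker on the block to cache); treat the exact shape as illustrative:

```python
# Sketch: system as an array of content blocks instead of a plain string.
system_blocks = [
    {"type": "text", "text": "You are a support assistant."},
    {
        "type": "text",
        "text": "<large reference document here>",
        "cache_control": {"type": "ephemeral"},  # mark this block cacheable
    },
]

payload = {
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "system": system_blocks,
    "messages": [{"role": "user", "content": "What does the doc say?"}],
}
```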
stream
boolean
default:false
Whether to stream the response as Server-Sent Events (SSE). When enabled, partial message deltas are sent incrementally. Note: Streaming is only supported when using the Anthropic provider. Other providers will return an error.
tools
array
Definitions of tools the model can use. Each tool represents a function the model can call.
tool_choice
object
Controls which tool the model should use. Can be:
  • {"type": "auto"} - Model decides whether to use tools (default)
  • {"type": "any"} - Model must use one of the provided tools
  • {"type": "tool", "name": "tool_name"} - Model must use the specified tool
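The three accepted shapes can be checked with a small validator. This is a client-side sketch mirroring the rules above, not part of the API:

```python
def validate_tool_choice(tool_choice: dict, tool_names: set) -> bool:
    """Accept the three tool_choice shapes described above."""
    t = tool_choice.get("type")
    if t in ("auto", "any"):
        return True
    if t == "tool":
        # The named tool must exist among the provided tool definitions.
        return tool_choice.get("name") in tool_names
    return False
```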

Response Format

Non-Streaming Response

id
string
Unique identifier for this message.
model
string
The model that was used to generate the response.
content
array
Array of content blocks in the response. Each block has a type field indicating its type. Block types include:
  • text: Text content from the model
  • tool_use: Tool call made by the model (includes id, name, and input fields)
usage
object
Token usage statistics for this request.
stop_reason
string
Why the model stopped generating. Possible values:
  • end_turn: Model reached a natural stopping point
  • max_tokens: Reached the max_tokens limit
  • tool_use: Model called a tool
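A typical consumer separates the content blocks by type and branches on stop_reason. A minimal sketch, using the response fields documented above:

```python
def handle_response(resp: dict):
    """Collect text and tool_use blocks, and flag truncation."""
    text = "".join(b["text"] for b in resp["content"] if b["type"] == "text")
    tool_calls = [b for b in resp["content"] if b["type"] == "tool_use"]
    truncated = resp.get("stop_reason") == "max_tokens"
    # If truncated, consider retrying with a higher max_tokens.
    return text, tool_calls, truncated
```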

Streaming Response

When stream: true, the response is sent as Server-Sent Events (SSE). Each event consists of an event: line followed by a data: line. Event types:
  • message_start: Initial event with message metadata
  • content_block_start: A new content block begins
  • content_block_delta: Incremental content for the current block
  • content_block_stop: Current content block is complete
  • message_delta: Message-level delta (includes stop_reason when done)
  • message_stop: Stream is complete
  • error: An error occurred
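Assembling the streamed text means concatenating the text_delta payloads of content_block_delta events. A sketch over a fully buffered SSE body (real streams arrive chunked, so a production client would parse incrementally):

```python
import json

def collect_text(sse_body: str) -> str:
    """Join the text deltas from content_block_delta events."""
    out = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip event: lines and blanks
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                out.append(delta.get("text", ""))
    return "".join(out)
```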

Special Headers

X-Edgee-Enable-Compression
boolean
Enable token compression to reduce token usage. When enabled, the gateway automatically compresses your prompts to reduce costs by up to 50%. See Token Compression for more details.
X-edgee-tags
string
Comma-separated list of tags for categorizing and filtering requests in analytics and logs. Example: X-edgee-tags: production,chatbot,customer-support
X-Edgee-Debug
boolean
Enable debug mode to include additional debugging information in the response.

Examples

Basic Message

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Hello, Claude! How are you today?"
      }
    ]
  }'

Multi-Turn Conversation

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      },
      {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      {
        "role": "user",
        "content": "What is its population?"
      }
    ]
  }'

With System Prompt

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "system": "You are a helpful assistant that always responds in a friendly and concise manner.",
    "messages": [
      {
        "role": "user",
        "content": "Tell me about the solar system."
      }
    ]
  }'

Streaming

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Write a haiku about coding."
      }
    ]
  }'

With Tools

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "What is the weather in San Francisco?"
      }
    ]
  }'
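When the model answers with stop_reason tool_use, the client runs the tool and sends the result back as a tool_result block inside a user message, echoing the tool_use block's id. A sketch of that follow-up turn (the ids and values are illustrative):

```python
def tool_result_turn(messages: list, assistant_content: list, result: str) -> list:
    """Append the assistant's tool_use turn and a tool_result reply.

    assistant_content is the `content` array from the model's response;
    the tool_use block's id must be echoed back as tool_use_id.
    """
    tool_use = next(b for b in assistant_content if b["type"] == "tool_use")
    return messages + [
        {"role": "assistant", "content": assistant_content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use["id"],
                    "content": result,
                }
            ],
        },
    ]
```

The extended messages array is then sent back to /v1/messages with the same tools definitions so the model can produce its final answer.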

With Token Compression

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -H 'X-Edgee-Enable-Compression: true' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Summarize this long document..."
      }
    ]
  }'

Error Handling

See the Errors page for details on error responses. Common errors specific to this endpoint:
  • streaming_not_supported: Streaming was requested but the model doesn’t use the Anthropic provider
  • count_tokens_not_supported: Token counting is only available with Anthropic provider

SDK Integration

For detailed examples of using this endpoint with the Anthropic SDK, see the SDK Integration guide.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your API key. See the Authentication page for more information.

Body

application/json
model
string
required

The model ID to use (Anthropic format, without provider prefix)

Example:

"claude-sonnet-4.5"

max_tokens
integer
required

Maximum number of tokens to generate

Required range: x >= 1
Example:

1024

messages
object[]
required

Array of message objects

Minimum array length: 1
system
string | array

System prompt as a string or an array of content blocks

stream
boolean
default:false

Enable streaming responses

tools
object[]

Tool definitions

tool_choice
object

Controls which tool the model should use

Response

Message created successfully

id
string
required

Unique identifier for this message

model
string
required

The model that generated the response

content
object[]
required

Array of content blocks

usage
object
required

Token usage statistics for this request

stop_reason
enum<string>

Why the model stopped generating

Available options:
end_turn,
max_tokens,
tool_use