POST /v1/messages
Create message (Anthropic format)
curl --request POST \
  --url https://api.edgee.ai/v1/messages \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "claude-sonnet-4.5",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "<string>"
    }
  ],
  "system": "<string>",
  "stream": false,
  "tools": [
    {
      "name": "<string>",
      "input_schema": {},
      "description": "<string>"
    }
  ],
  "tool_choice": {
    "type": "auto"
  }
}
'
{
  "id": "<string>",
  "model": "<string>",
  "content": [
    {
      "type": "text",
      "text": "<string>"
    }
  ],
  "usage": {
    "input_tokens": 1,
    "output_tokens": 1
  },
  "stop_reason": "end_turn"
}
Creates a message using Anthropic’s native Messages API format. This endpoint provides the same API format as Anthropic’s official API, making it easy to migrate existing integrations or use Anthropic-specific features.
This endpoint only works with the Anthropic provider. For multi-provider support, use the Chat Completions endpoint with OpenAI format.

Overview

The /v1/messages endpoint implements Anthropic’s Messages API format, which differs from the OpenAI-compatible /v1/chat/completions endpoint in several key ways:
  • Model format: Use claude-sonnet-4.5 instead of anthropic/claude-sonnet-4.5 (no provider prefix)
  • Required max_tokens: The max_tokens parameter is always required (not optional like OpenAI)
  • System prompt: System messages use a separate system field instead of being part of the messages array
  • Response format: Returns Anthropic’s native response format with different structure and field names
Use this endpoint when:
  • Migrating from Anthropic’s API to Edgee
  • Using the Anthropic SDK or tools that expect the native Anthropic format
  • Requiring Anthropic-specific features or response structures
For new integrations or multi-provider support, we recommend using the Chat Completions endpoint instead.
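To make the differences above concrete, here is a minimal sketch of the same request expressed in both formats. The payloads are illustrative, not exhaustive:

```python
# The same request in Anthropic Messages format vs. OpenAI Chat
# Completions format, illustrating the differences listed above.
anthropic_request = {
    "model": "claude-sonnet-4.5",        # no provider prefix
    "max_tokens": 1024,                  # always required
    "system": "You are concise.",        # separate top-level field
    "messages": [{"role": "user", "content": "Hello!"}],
}

openai_request = {
    "model": "anthropic/claude-sonnet-4.5",  # provider prefix required
    # max_tokens is optional in this format
    "messages": [
        {"role": "system", "content": "You are concise."},  # part of messages
        {"role": "user", "content": "Hello!"},
    ],
}
```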

Authentication

This endpoint supports both authentication methods.
Anthropic-style (preferred):
x-api-key: <api_key>
OpenAI-style (also supported):
Authorization: Bearer <api_key>
If both headers are provided, x-api-key takes precedence. See the Authentication page for more details.

Request Format

model
string
required
The model ID to use. Use Anthropic model names without a provider prefix. Examples: claude-sonnet-4.5, claude-opus-4, claude-haiku-4
max_tokens
integer
required
The maximum number of tokens to generate before stopping. This parameter is required (unlike OpenAI’s API where it’s optional). Note that Claude models may stop before reaching this maximum if they hit a natural stopping point.
messages
array
required
Array of message objects representing the conversation history. Each message must have a role and content.
system
string | array
System prompt to guide the model’s behavior. This is separate from the messages array. It can be either:
  • A simple string for basic system prompts
  • An array of content blocks for structured system prompts (supports text blocks with cache_control)
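A structured system prompt might be built like this. The cache_control value follows Anthropic's prompt-caching convention (an "ephemeral" marker on the block to cache); treat the exact shape as illustrative:

```python
# Sketch: system as an array of content blocks instead of a plain string.
system_blocks = [
    {"type": "text", "text": "You are a support assistant."},
    {
        "type": "text",
        "text": "<large reference document here>",
        "cache_control": {"type": "ephemeral"},  # mark this block cacheable
    },
]

payload = {
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "system": system_blocks,
    "messages": [{"role": "user", "content": "What does the doc say?"}],
}
```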
stream
boolean
default:false
Whether to stream the response as Server-Sent Events (SSE). When enabled, partial message deltas are sent incrementally. Note: Streaming is only supported when using the Anthropic provider. Other providers will return an error.
tools
array
Definitions of tools the model can use. Each tool represents a function the model can call.
tool_choice
object
Controls which tool the model should use. Can be:
  • {"type": "auto"} - Model decides whether to use tools (default)
  • {"type": "any"} - Model must use one of the provided tools
  • {"type": "tool", "name": "tool_name"} - Model must use the specified tool
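The three accepted shapes can be checked with a small validator. This is a client-side sketch mirroring the rules above, not part of the API:

```python
def validate_tool_choice(tool_choice: dict, tool_names: set) -> bool:
    """Accept the three tool_choice shapes described above."""
    t = tool_choice.get("type")
    if t in ("auto", "any"):
        return True
    if t == "tool":
        # The named tool must exist among the provided tool definitions.
        return tool_choice.get("name") in tool_names
    return False
```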

Response Format

Non-Streaming Response

id
string
Unique identifier for this message.
model
string
The model that was used to generate the response.
content
array
Array of content blocks in the response. Each block has a type field indicating its type. Block types include:
  • text: Text content from the model
  • tool_use: Tool call made by the model (includes id, name, and input fields)
usage
object
Token usage statistics for this request.
stop_reason
string
Why the model stopped generating. Possible values:
  • end_turn: Model reached a natural stopping point
  • max_tokens: Reached the max_tokens limit
  • tool_use: Model called a tool
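A typical consumer separates the content blocks by type and branches on stop_reason. A minimal sketch, using the response fields documented above:

```python
def handle_response(resp: dict):
    """Collect text and tool_use blocks, and flag truncation."""
    text = "".join(b["text"] for b in resp["content"] if b["type"] == "text")
    tool_calls = [b for b in resp["content"] if b["type"] == "tool_use"]
    truncated = resp.get("stop_reason") == "max_tokens"
    # If truncated, consider retrying with a higher max_tokens.
    return text, tool_calls, truncated
```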

Streaming Response

When stream: true, the response is sent as Server-Sent Events (SSE). Each event consists of an event: line followed by a data: line. Event types:
  • message_start: Initial event with message metadata
  • content_block_start: A new content block begins
  • content_block_delta: Incremental content for the current block
  • content_block_stop: Current content block is complete
  • message_delta: Message-level delta (includes stop_reason when done)
  • message_stop: Stream is complete
  • error: An error occurred
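Assembling the streamed text means concatenating the text_delta payloads of content_block_delta events. A sketch over a fully buffered SSE body (real streams arrive chunked, so a production client would parse incrementally):

```python
import json

def collect_text(sse_body: str) -> str:
    """Join the text deltas from content_block_delta events."""
    out = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip event: lines and blanks
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                out.append(delta.get("text", ""))
    return "".join(out)
```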

Special Headers

X-Edgee-Enable-Compression
boolean
Enable token compression to reduce token usage. When enabled, the gateway automatically compresses your prompts to reduce costs by up to 50%. See Token Compression for more details.
X-edgee-tags
string
Comma-separated list of tags for categorizing and filtering requests in analytics and logs. Example: X-edgee-tags: production,chatbot,customer-support
X-Edgee-Debug
boolean
Enable debug mode to include additional debugging information in the response.

Examples

Basic Message

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Hello, Claude! How are you today?"
      }
    ]
  }'

Multi-Turn Conversation

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      },
      {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      {
        "role": "user",
        "content": "What is its population?"
      }
    ]
  }'

With System Prompt

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "system": "You are a helpful assistant that always responds in a friendly and concise manner.",
    "messages": [
      {
        "role": "user",
        "content": "Tell me about the solar system."
      }
    ]
  }'

Streaming

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Write a haiku about coding."
      }
    ]
  }'

With Tools

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "tools": [
      {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "What is the weather in San Francisco?"
      }
    ]
  }'
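When the model answers with stop_reason tool_use, the client runs the tool and sends the result back as a tool_result block inside a user message, echoing the tool_use block's id. A sketch of that follow-up turn (the ids and values are illustrative):

```python
def tool_result_turn(messages: list, assistant_content: list, result: str) -> list:
    """Append the assistant's tool_use turn and a tool_result reply.

    assistant_content is the `content` array from the model's response;
    the tool_use block's id must be echoed back as tool_use_id.
    """
    tool_use = next(b for b in assistant_content if b["type"] == "tool_use")
    return messages + [
        {"role": "assistant", "content": assistant_content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use["id"],
                    "content": result,
                }
            ],
        },
    ]
```

The extended messages array is then sent back to /v1/messages with the same tools definitions so the model can produce its final answer.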

With Token Compression

curl 'https://api.edgee.ai/v1/messages' \
  -H "x-api-key: $EDGEE_API_KEY" \
  -H 'Content-Type: application/json' \
  -H 'X-Edgee-Enable-Compression: true' \
  -d '{
    "model": "claude-sonnet-4.5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Summarize this long document..."
      }
    ]
  }'

Error Handling

See the Errors page for details on error responses. Common errors specific to this endpoint:
  • streaming_not_supported: Streaming was requested but the model doesn’t use the Anthropic provider
  • count_tokens_not_supported: Token counting is only available with Anthropic provider

SDK Integration

For detailed examples of using this endpoint with the Anthropic SDK, see the SDK Integration guide.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your API key. See the Authentication page for more information.

Body

application/json
model
string
required

The model ID to use (Anthropic format, without provider prefix)

Example:

"claude-sonnet-4.5"

max_tokens
integer
required

Maximum number of tokens to generate

Required range: x >= 1
Example:

1024

messages
object[]
required

Array of message objects

Minimum array length: 1
system
string | array

System prompt as a string or an array of content blocks

stream
boolean
default:false

Enable streaming responses

tools
object[]

Tool definitions

tool_choice
object

Controls which tool the model should use

Response

Message created successfully

id
string
required

Unique identifier for this message

model
string
required

The model that generated the response

content
object[]
required

Array of content blocks

usage
object
required

Token usage statistics for this request

stop_reason
enum<string>

Why the model stopped generating

Available options:
end_turn,
max_tokens,
tool_use