The send() method makes chat completion requests to the Edgee AI Gateway. By default it returns a SendResponse object containing the model's response; with stream=True it returns a generator of StreamChunk objects instead.

Arguments

| Parameter | Type | Description |
| --- | --- | --- |
| model | str | The model identifier to use (e.g., "gpt-4o") |
| input | str \| InputObject \| dict | The input for the completion. Can be a simple string, a structured InputObject, or a dictionary |
| stream | bool | If True, returns a generator yielding StreamChunk objects. If False (default), returns a SendResponse object |

Input Types

String Input

When input is a string, it’s automatically converted to a user message:
```python
response = edgee.send(
    model="gpt-4o",
    input="What is the capital of France?"
)

# Equivalent to: input={"messages": [{"role": "user", "content": "What is the capital of France?"}]}
print(response.text)
# "The capital of France is Paris."
```

InputObject or Dictionary

When input is an InputObject or dictionary, you have full control over the conversation:
| Property | Type | Description |
| --- | --- | --- |
| messages | list[dict] | Array of conversation messages |
| tools | list[dict] \| None | Array of function tools available to the model |
| tool_choice | str \| dict \| None | Controls which tool (if any) the model should call. See the Tools documentation for details |
| tags | list[str] \| None | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the x-edgee-tags header (comma-separated) |
| enable_compression | bool | Enables token compression for this request. When true, the prompt is compressed at the rate specified in the API key settings; when false, no compression is applied |
| compression_rate | float | The compression rate to apply when enable_compression is true, between 0.0 and 1.0. Defaults to 0.75 |
Example with Dictionary Input:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "user", "content": "What is 2+2?"}
        ]
    }
)

print(response.text)
# "2+2 equals 4."
```
Example with Tags:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "user", "content": "Summarize this article"}
        ],
        "tags": ["summarization", "production", "user-123"]
    }
)
```

Message Object

Each message in the messages array has the following structure:
| Property | Type | Description |
| --- | --- | --- |
| role | str | The role of the message sender: "system", "developer", "user", "assistant", or "tool" |
| content | str \| None | The message content. Required for the system, developer, user, and tool roles; optional for assistant when tool_calls is present |
| name | str \| None | Optional name for the message sender |
| tool_calls | list[dict] \| None | Array of tool calls made by the assistant. Only present in assistant messages |
| tool_call_id | str \| None | ID of the tool call this message is responding to. Required for tool role messages |
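
As a quick sanity check, the structural rules in the table above can be expressed as a small validator. This is an illustrative sketch, not part of the SDK:

```python
def validate_message(msg: dict) -> list[str]:
    """Return a list of problems with a message dict (empty list means OK)."""
    errors = []
    role = msg.get("role")
    if role not in ("system", "developer", "user", "assistant", "tool"):
        errors.append(f"unknown role: {role!r}")
    # content is required for every role except assistant,
    # which may carry tool_calls instead
    if role != "assistant" and not msg.get("content"):
        errors.append("content is required for this role")
    if role == "assistant" and msg.get("content") is None and not msg.get("tool_calls"):
        errors.append("assistant messages need content or tool_calls")
    if role == "tool" and not msg.get("tool_call_id"):
        errors.append("tool messages require tool_call_id")
    return errors
```

For example, validate_message({"role": "user", "content": "Hi"}) returns an empty list, while a tool message without tool_call_id is flagged.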

Message Roles

  • system: System instructions that set the behavior of the assistant
  • developer: Instructions provided by the application developer, prioritized ahead of user messages
  • user: Messages provided by the end user
  • assistant: Assistant responses (can include tool_calls)
  • tool: Results from tool/function calls (requires tool_call_id)
Example - System and User Messages:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is 2+2?"},
            {"role": "assistant", "content": "2+2 equals 4."},
            {"role": "user", "content": "What about 3+3?"}
        ]
    }
)

print(response.text)
# "3+3 equals 6."
```
For complete tool calling examples and best practices, see Tools documentation.
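
For reference, this is the shape of a tool-calling exchange expressed as plain message dicts. The inner tool_calls structure (function name and JSON-encoded arguments) follows the common OpenAI-style convention and is an assumption here; treat the Tools documentation as authoritative:

```python
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    # Assistant turn: no text content, just a tool call request
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }],
    },
    # Tool turn: the result, linked back to the request via tool_call_id
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 18}'},
]
```

Note how the tool message's tool_call_id matches the id of the assistant's tool call, as required by the message rules above.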

Return Value

The send() method returns a SendResponse object when stream=False (default):

SendResponse Object

| Property | Type | Description |
| --- | --- | --- |
| choices | list[Choice] | Array of completion choices (typically one) |
| usage | Usage \| None | Token usage information (if provided by the API) |
| compression | Compression \| None | Token compression metrics (if compression was applied) |

Choice Object

Each choice in the choices array contains:
| Property | Type | Description |
| --- | --- | --- |
| index | int | The index of this choice in the array |
| message | dict | The assistant's message response |
| finish_reason | str \| None | Reason why generation stopped: "stop", "length", "tool_calls", "content_filter", or None |
Example - Handling Multiple Choices:
```python
response = edgee.send(
    model="gpt-4o",
    input="Give me a creative idea."
)

# Process all choices
for choice in response.choices:
    print(f"Choice {choice.index}: {choice.message.get('content')}")
    print(f"Finish reason: {choice.finish_reason}")
```

Message Object (in Response)

The message in each choice has:
| Property | Type | Description |
| --- | --- | --- |
| role | str | The role of the message (typically "assistant") |
| content | str \| None | The text content of the response. None when tool_calls is present |
| tool_calls | list[dict] \| None | Array of tool calls requested by the model (if any). See the Tools documentation for details |
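
Because exactly one of content or tool_calls is meaningful, a response message can be dispatched on whichever is present. A minimal sketch, operating on plain dicts rather than SDK types:

```python
def dispatch(message: dict) -> tuple:
    """Return ('tools', list) when the model requested tool calls,
    otherwise ('text', str | None) for a plain reply."""
    if message.get("tool_calls"):
        return ("tools", message["tool_calls"])
    return ("text", message.get("content"))
```

For example, dispatch({"role": "assistant", "content": "Hi"}) yields ("text", "Hi"), while a message carrying tool_calls yields the list of calls to execute.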

Usage Object

Token usage information (when available):
| Property | Type | Description |
| --- | --- | --- |
| prompt_tokens | int | Number of tokens in the prompt (after compression, if applied) |
| completion_tokens | int | Number of tokens in the completion |
| total_tokens | int | Total tokens used (prompt + completion) |
Example - Accessing Token Usage:
```python
response = edgee.send(
    model="gpt-4o",
    input="Explain quantum computing briefly."
)

if response.usage:
    print(f"Prompt tokens: {response.usage.prompt_tokens}")
    print(f"Completion tokens: {response.usage.completion_tokens}")
    print(f"Total tokens: {response.usage.total_tokens}")
```

Compression Object

Token compression metrics (when compression is applied):
| Property | Type | Description |
| --- | --- | --- |
| input_tokens | int | Original number of input tokens before compression |
| saved_tokens | int | Number of tokens saved by compression |
| rate | float | Compression rate as a decimal (0-1). For example, 0.61 means 61% compression |
Example - Accessing Compression Metrics:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "user", "content": "Analyze this long document with lots of context..."}
        ],
        "enable_compression": True,
        "compression_rate": 0.8
    }
)

if response.compression:
    print(f"Original input tokens: {response.compression.input_tokens}")
    print(f"Tokens saved: {response.compression.saved_tokens}")
    print(f"Compression rate: {response.compression.rate * 100:.1f}%")
```
The compression object is only present when token compression is applied to the request. Simple queries may not trigger compression.

Convenience Properties

The SendResponse class provides convenience properties for easier access:
| Property | Type | Description |
| --- | --- | --- |
| text | str \| None | Shortcut for choices[0].message["content"] |
| message | dict \| None | Shortcut for choices[0].message |
| finish_reason | str \| None | Shortcut for choices[0].finish_reason |
| tool_calls | list \| None | Shortcut for choices[0].message.get("tool_calls") |
Example - Using Convenience Properties:
```python
response = edgee.send(
    model="gpt-4o",
    input="Hello!"
)

# Instead of: response.choices[0].message["content"]
print(response.text)

# Instead of: response.choices[0].message
print(response.message)

# Instead of: response.choices[0].finish_reason
print(response.finish_reason)

# Instead of: response.choices[0].message.get("tool_calls")
if response.tool_calls:
    print("Tool calls:", response.tool_calls)
```

Streaming with send()

You can use send() with stream=True to get streaming responses. This returns a generator yielding StreamChunk objects:
```python
for chunk in edgee.send("gpt-4o", "Tell me a story", stream=True):
    if chunk.text:
        print(chunk.text, end="", flush=True)
```
For more details about streaming, see the Stream Method documentation.
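
When you need the full completion as well, the chunks can be accumulated as they arrive. The sketch below uses stand-in chunk objects in place of a live edgee.send(..., stream=True) call; only the accumulation pattern is the point:

```python
from types import SimpleNamespace

def collect_text(chunks) -> str:
    """Join the text of every chunk that carries any (mirrors the loop above)."""
    parts = []
    for chunk in chunks:
        if chunk.text:
            parts.append(chunk.text)
    return "".join(parts)

# Stand-in for a streaming response: some chunks carry text,
# others (e.g. a final metadata chunk) may not.
fake_stream = (SimpleNamespace(text=t) for t in ["Once", " upon", " a time", None])
print(collect_text(fake_stream))  # -> "Once upon a time"
```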

Error Handling

The send() method can raise exceptions in several scenarios:
```python
try:
    response = edgee.send(
        model="gpt-4o",
        input="Hello!"
    )
except RuntimeError as error:
    # API errors: "API error {status}: {message}"
    # Network errors: Standard HTTP errors
    print(f"Request failed: {error}")
```

Common Errors

  • API errors: RuntimeError: API error {status}: {message} - The API returned an error status
  • Network errors: Standard HTTP errors from urllib
  • Invalid input: Errors from invalid request structure
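
Since transient failures such as rate limits or network blips surface as RuntimeError, a simple retry wrapper is often enough. A sketch with exponential backoff; the retry count and delays are illustrative choices, not SDK defaults:

```python
import time

def send_with_retry(send_fn, *, retries: int = 3, base_delay: float = 1.0):
    """Call send_fn(), retrying on RuntimeError with exponential backoff."""
    for attempt in range(retries):
        try:
            return send_fn()
        except RuntimeError:
            if attempt == retries - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** attempt))
```

Usage: send_with_retry(lambda: edgee.send(model="gpt-4o", input="Hello!")). Note this retries every RuntimeError; in practice you may want to inspect the message and retry only retryable statuses such as 429.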