The send() method makes chat completion requests to the Edgee AI Gateway. By default it returns a SendResponse object containing the model's response; with stream=True it returns a generator of StreamChunk objects instead.

Arguments

| Parameter | Type | Description |
| --- | --- | --- |
| model | str | The model identifier to use (e.g., "gpt-4o") |
| input | str \| InputObject \| dict | The input for the completion. Can be a simple string, a structured InputObject, or a dictionary |
| stream | bool | If True, returns a generator yielding StreamChunk objects. If False (default), returns a SendResponse object |

Input Types

String Input

When input is a string, it’s automatically converted to a user message:
```python
response = edgee.send(
    model="gpt-4o",
    input="What is the capital of France?"
)

# Equivalent to: input={"messages": [{"role": "user", "content": "What is the capital of France?"}]}
print(response.text)
# "The capital of France is Paris."
```

InputObject or Dictionary

When input is an InputObject or dictionary, you have full control over the conversation:
| Property | Type | Description |
| --- | --- | --- |
| messages | list[dict] | Array of conversation messages |
| tools | list[dict] \| None | Array of function tools available to the model |
| tool_choice | str \| dict \| None | Controls which tool (if any) the model should call. See the Tools documentation for details |
| tags | list[str] \| None | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the x-edgee-tags header (comma-separated) |
| enable_compression | bool | Enables token compression for this request. When true, the prompt is compressed at the rate specified in the API key settings; when false, no compression is applied |
| compression_rate | float | The compression rate to apply when enable_compression is true, between 0.0 and 1.0. Defaults to 0.75 |
Example with Dictionary Input:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "user", "content": "What is 2+2?"}
        ]
    }
)

print(response.text)
# "2+2 equals 4."
```
Example with Tags:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "user", "content": "Summarize this article"}
        ],
        "tags": ["summarization", "production", "user-123"]
    }
)
```

Message Object

Each message in the messages array has the following structure:
| Property | Type | Description |
| --- | --- | --- |
| role | str | The role of the message sender: "system", "developer", "user", "assistant", or "tool" |
| content | str \| None | The message content. Required for the system, developer, user, and tool roles; optional for assistant when tool_calls is present |
| name | str \| None | Optional name for the message sender |
| tool_calls | list[dict] \| None | Array of tool calls made by the assistant. Only present in assistant messages |
| tool_call_id | str \| None | ID of the tool call this message is responding to. Required for tool role messages |
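
As a quick sanity check, the structural rules in the table above can be expressed as a small validator. This is an illustrative sketch, not part of the SDK:

```python
def validate_message(msg: dict) -> list[str]:
    """Return a list of problems with a message dict (empty list means OK)."""
    errors = []
    role = msg.get("role")
    if role not in ("system", "developer", "user", "assistant", "tool"):
        errors.append(f"unknown role: {role!r}")
    # content is required for every role except assistant,
    # which may carry tool_calls instead
    if role != "assistant" and not msg.get("content"):
        errors.append("content is required for this role")
    if role == "assistant" and msg.get("content") is None and not msg.get("tool_calls"):
        errors.append("assistant messages need content or tool_calls")
    if role == "tool" and not msg.get("tool_call_id"):
        errors.append("tool messages require tool_call_id")
    return errors
```

For example, validate_message({"role": "user", "content": "Hi"}) returns an empty list, while a tool message without tool_call_id is flagged.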

Message Roles

  • system: System instructions that set the behavior of the assistant
  • developer: Instructions provided by the application developer, prioritized ahead of user messages
  • user: Messages provided by the end user
  • assistant: Assistant responses (can include tool_calls)
  • tool: Results from tool/function calls (requires tool_call_id)
Example - System and User Messages:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is 2+2?"},
            {"role": "assistant", "content": "2+2 equals 4."},
            {"role": "user", "content": "What about 3+3?"}
        ]
    }
)

print(response.text)
# "3+3 equals 6."
```
For complete tool calling examples and best practices, see Tools documentation.
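
For reference, this is the shape of a tool-calling exchange expressed as plain message dicts. The inner tool_calls structure (function name and JSON-encoded arguments) follows the common OpenAI-style convention and is an assumption here; treat the Tools documentation as authoritative:

```python
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    # Assistant turn: no text content, just a tool call request
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }],
    },
    # Tool turn: the result, linked back to the request via tool_call_id
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 18}'},
]
```

Note how the tool message's tool_call_id matches the id of the assistant's tool call, as required by the message rules above.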

Return Value

The send() method returns a SendResponse object when stream=False (default):

SendResponse Object

| Property | Type | Description |
| --- | --- | --- |
| choices | list[Choice] | Array of completion choices (typically one) |
| usage | Usage \| None | Token usage information (if provided by the API) |
| compression | Compression \| None | Token compression metrics (if compression was applied) |

Choice Object

Each choice in the choices array contains:
| Property | Type | Description |
| --- | --- | --- |
| index | int | The index of this choice in the array |
| message | dict | The assistant's message response |
| finish_reason | str \| None | Reason why generation stopped: "stop", "length", "tool_calls", "content_filter", or None |
Example - Handling Multiple Choices:
```python
response = edgee.send(
    model="gpt-4o",
    input="Give me a creative idea."
)

# Process all choices
for choice in response.choices:
    print(f"Choice {choice.index}: {choice.message.get('content')}")
    print(f"Finish reason: {choice.finish_reason}")
```

Message Object (in Response)

The message in each choice has:
| Property | Type | Description |
| --- | --- | --- |
| role | str | The role of the message (typically "assistant") |
| content | str \| None | The text content of the response. None when tool_calls is present |
| tool_calls | list[dict] \| None | Array of tool calls requested by the model (if any). See the Tools documentation for details |
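
Because exactly one of content or tool_calls is meaningful, a response message can be dispatched on whichever is present. A minimal sketch, operating on plain dicts rather than SDK types:

```python
def dispatch(message: dict) -> tuple:
    """Return ('tools', list) when the model requested tool calls,
    otherwise ('text', str | None) for a plain reply."""
    if message.get("tool_calls"):
        return ("tools", message["tool_calls"])
    return ("text", message.get("content"))
```

For example, dispatch({"role": "assistant", "content": "Hi"}) yields ("text", "Hi"), while a message carrying tool_calls yields the list of calls to execute.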

Usage Object

Token usage information (when available):
| Property | Type | Description |
| --- | --- | --- |
| prompt_tokens | int | Number of tokens in the prompt (after compression, if applied) |
| completion_tokens | int | Number of tokens in the completion |
| total_tokens | int | Total tokens used (prompt + completion) |
Example - Accessing Token Usage:
```python
response = edgee.send(
    model="gpt-4o",
    input="Explain quantum computing briefly."
)

if response.usage:
    print(f"Prompt tokens: {response.usage.prompt_tokens}")
    print(f"Completion tokens: {response.usage.completion_tokens}")
    print(f"Total tokens: {response.usage.total_tokens}")
```

Compression Object

Token compression metrics (when compression is applied):
| Property | Type | Description |
| --- | --- | --- |
| input_tokens | int | Original number of input tokens before compression |
| saved_tokens | int | Number of tokens saved by compression |
| rate | float | Compression rate as a decimal (0-1). For example, 0.61 means 61% compression |
Example - Accessing Compression Metrics:
```python
response = edgee.send(
    model="gpt-4o",
    input={
        "messages": [
            {"role": "user", "content": "Analyze this long document with lots of context..."}
        ],
        "enable_compression": True,
        "compression_rate": 0.8
    }
)

if response.compression:
    print(f"Original input tokens: {response.compression.input_tokens}")
    print(f"Tokens saved: {response.compression.saved_tokens}")
    print(f"Compression rate: {response.compression.rate * 100:.1f}%")
```
The compression object is only present when token compression is applied to the request. Simple queries may not trigger compression.

Convenience Properties

The SendResponse class provides convenience properties for easier access:
| Property | Type | Description |
| --- | --- | --- |
| text | str \| None | Shortcut for choices[0].message["content"] |
| message | dict \| None | Shortcut for choices[0].message |
| finish_reason | str \| None | Shortcut for choices[0].finish_reason |
| tool_calls | list \| None | Shortcut for choices[0].message.get("tool_calls") |
Example - Using Convenience Properties:
```python
response = edgee.send(
    model="gpt-4o",
    input="Hello!"
)

# Instead of: response.choices[0].message["content"]
print(response.text)

# Instead of: response.choices[0].message
print(response.message)

# Instead of: response.choices[0].finish_reason
print(response.finish_reason)

# Instead of: response.choices[0].message.get("tool_calls")
if response.tool_calls:
    print("Tool calls:", response.tool_calls)
```

Streaming with send()

You can use send() with stream=True to get streaming responses. This returns a generator yielding StreamChunk objects:
```python
for chunk in edgee.send("gpt-4o", "Tell me a story", stream=True):
    if chunk.text:
        print(chunk.text, end="", flush=True)
```
For more details about streaming, see the Stream Method documentation.
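
When you need the full completion as well, the chunks can be accumulated as they arrive. The sketch below uses stand-in chunk objects in place of a live edgee.send(..., stream=True) call; only the accumulation pattern is the point:

```python
from types import SimpleNamespace

def collect_text(chunks) -> str:
    """Join the text of every chunk that carries any (mirrors the loop above)."""
    parts = []
    for chunk in chunks:
        if chunk.text:
            parts.append(chunk.text)
    return "".join(parts)

# Stand-in for a streaming response: some chunks carry text,
# others (e.g. a final metadata chunk) may not.
fake_stream = (SimpleNamespace(text=t) for t in ["Once", " upon", " a time", None])
print(collect_text(fake_stream))  # -> "Once upon a time"
```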

Error Handling

The send() method can raise exceptions in several scenarios:
```python
try:
    response = edgee.send(
        model="gpt-4o",
        input="Hello!"
    )
except RuntimeError as error:
    # API errors: "API error {status}: {message}"
    # Network errors: Standard HTTP errors
    print(f"Request failed: {error}")
```

Common Errors

  • API errors: RuntimeError: API error {status}: {message} - The API returned an error status
  • Network errors: Standard HTTP errors from urllib
  • Invalid input: Errors from invalid request structure
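
Since transient failures such as rate limits or network blips surface as RuntimeError, a simple retry wrapper is often enough. A sketch with exponential backoff; the retry count and delays are illustrative choices, not SDK defaults:

```python
import time

def send_with_retry(send_fn, *, retries: int = 3, base_delay: float = 1.0):
    """Call send_fn(), retrying on RuntimeError with exponential backoff."""
    for attempt in range(retries):
        try:
            return send_fn()
        except RuntimeError:
            if attempt == retries - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** attempt))
```

Usage: send_with_retry(lambda: edgee.send(model="gpt-4o", input="Hello!")). Note this retries every RuntimeError; in practice you may want to inspect the message and retry only retryable statuses such as 429.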