The `stream()` method makes streaming chat completion requests to the Edgee AI Gateway. It returns a generator that yields `StreamChunk` objects as they arrive from the API.
When input is a string, it’s automatically converted to a user message:
```python
for chunk in edgee.stream("gpt-5.2", "Tell me a story"):
    if chunk.text:
        print(chunk.text, end="", flush=True)
    if chunk.finish_reason:
        print(f"\nFinished: {chunk.finish_reason}")

# Equivalent to: input={"messages": [{"role": "user", "content": "Tell me a story"}]}
```
When input is an InputObject or dictionary, you have full control over the conversation:
| Property | Type | Description |
|---|---|---|
| `messages` | `list[dict]` | Array of conversation messages |
| `tools` | `list[dict] \| None` | Array of function tools available to the model |
| `tool_choice` | `str \| dict \| None` | Controls which tool (if any) the model should call. See the Tools documentation for details |
| `tags` | `list[str] \| None` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
| `compression_model` | `str` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present |
| `compression_configuration` | `dict` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `rate` (0.0-1.0, default 0.8) and `semantic_preservation_threshold` (0-100) |
```python
for chunk in edgee.stream("gpt-5.2", {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a poem about coding"}
    ]
}):
    if chunk.text:
        print(chunk.text, end="", flush=True)
```
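Beyond `messages`, the optional fields from the table above can be combined in the same input dictionary. Here is a minimal sketch attaching tags and tuning the `agentic` compression model; the field names and value ranges come from the table, but the specific tag strings and threshold are illustrative:

```python
full_text = ""
for chunk in edgee.stream("gpt-5.2", {
    "messages": [
        {"role": "user", "content": "Summarize our launch plan"}
    ],
    # Optional labels for analytics and filtering (illustrative values)
    "tags": ["internal", "summaries"],
    # Bundle of compression strategies; overrides API key settings when present
    "compression_model": "agentic",
    # Only available for the "agentic" compression model
    "compression_configuration": {
        "rate": 0.8,                            # 0.0-1.0, default 0.8
        "semantic_preservation_threshold": 90,  # 0-100, illustrative value
    },
}):
    if chunk.text:
        full_text += chunk.text
```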
The `finish_reason` field on each choice gives the reason generation stopped. It is only present in the final chunk. Possible values: `"stop"`, `"length"`, `"tool_calls"`, `"content_filter"`, or `None`.
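Example - Checking the Finish Reason:

A short sketch that branches on the value to detect truncated output; it uses the `chunk.finish_reason` convenience property described below, and the value names come from the list above:

```python
for chunk in edgee.stream("gpt-5.2", "Summarize this article"):
    if chunk.text:
        print(chunk.text, end="", flush=True)
    if chunk.finish_reason == "length":
        # The model hit its output limit; the response may be cut off
        print("\n[warning] Response truncated")
    elif chunk.finish_reason == "stop":
        # Natural end of generation
        print("\n[done]")
```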
Example - Handling Multiple Choices:
```python
for chunk in edgee.stream("gpt-5.2", "Give me creative ideas"):
    for choice in chunk.choices:
        if choice.delta.content:
            print(f"Choice {choice.index}: {choice.delta.content}")
```
The `StreamChunk` class provides convenience properties for easier access:
| Property | Type | Description |
|---|---|---|
| `text` | `str \| None` | Shortcut to `choices[0].delta.content` - the incremental text content |
| `role` | `str \| None` | Shortcut to `choices[0].delta.role` - the message role (first chunk only) |
| `finish_reason` | `str \| None` | Shortcut to `choices[0].finish_reason` - the finish reason (final chunk only) |
Example - Using Convenience Properties:
```python
for chunk in edgee.stream("gpt-5.2", "Explain quantum computing"):
    # Content chunks
    if chunk.text:
        print(chunk.text, end="", flush=True)

    # First chunk contains the role
    if chunk.role:
        print(f"\nRole: {chunk.role}")

    # Last chunk contains finish reason
    if chunk.finish_reason:
        print(f"\nFinish reason: {chunk.finish_reason}")
```
A stream progresses through three kinds of chunks, each mapped to a convenience property above:

- First chunk: contains the message role
- Content chunks: carry the incremental text
- Final chunk: contains `finish_reason` indicating why generation stopped
Example - Collecting Full Response:
```python
full_text = ""
for chunk in edgee.stream("gpt-5.2", "Tell me a story"):
    if chunk.text:
        full_text += chunk.text
        print(chunk.text, end="", flush=True)  # Also display as it streams

print(f"\n\nFull response ({len(full_text)} characters):")
print(full_text)
```
Some chunks may not contain content. This is normal and can happen when:

- The chunk only contains metadata (role, finish_reason)
- The chunk is part of tool call processing
- Network buffering creates empty chunks
Always check `chunk.text` before using it:
```python
for chunk in edgee.stream("gpt-5.2", "Hello"):
    if chunk.text:  # ✅ Good: check before using
        print(chunk.text)

# ❌ Bad: print(chunk.text) - may print None
```
Example - Error Handling:

```python
try:
    for chunk in edgee.stream("gpt-5.2", "Hello!"):
        if chunk.text:
            print(chunk.text, end="", flush=True)
except RuntimeError as error:
    # API errors: "API error {status}: {message}"
    # Network errors: standard HTTP errors
    print(f"Stream failed: {error}")
```