Welcome to Edgee

Edgee is an AI Gateway that reduces LLM costs by up to 50% through intelligent token compression. Behind a single OpenAI-compatible API, you get access to 200+ models with automatic cost optimization, intelligent routing, and full observability.

Get Started in 6 Lines

The same request in TypeScript, Python, Go, and Rust:

TypeScript

import Edgee from 'edgee';

const edgee = new Edgee('your-api-key');

const response = await edgee.send({
  model: 'gpt-4o',
  input: 'What is the capital of France?',
});

console.log(response.text);
if (response.compression) {
  console.log(`Tokens saved: ${response.compression.saved_tokens}`);
}

Python

from edgee import Edgee

edgee = Edgee("your-api-key")

response = edgee.send(
    model="gpt-4o",
    input="What is the capital of France?"
)

print(response.text)
if response.compression:
    print(f"Tokens saved: {response.compression.saved_tokens}")

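Go

package main

import (
    "fmt"
    "log"

    "github.com/edgee-ai/go-sdk/edgee"
)

func main() {
    // Check the constructor's second return value instead of discarding it.
    client, err := edgee.NewClient("your-api-key")
    if err != nil {
        log.Fatal(err)
    }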

    response, err := client.Send("gpt-4o", "What is the capital of France?")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(response.Text())
    if response.Compression != nil {
        fmt.Printf("Tokens saved: %d\n", response.Compression.SavedTokens)
    }
}

Rust

use edgee::Edgee;

// Assumption: the Tokio runtime drives the async call.
#[tokio::main]
async fn main() {
    let client = Edgee::with_api_key("your-api-key");
    let response = client.send("gpt-4o", "What is the capital of France?").await.unwrap();

    println!("{}", response.text().unwrap_or(""));
    if let Some(compression) = &response.compression {
        println!("Tokens saved: {}", compression.saved_tokens);
    }
}

That’s it. You now have access to every major LLM provider, automatic failovers, cost tracking, and full observability, all through one simple API.

3B+ Requests/Month

Up to 50% Input Token Reduction

100+ Global PoPs

Why Choose Edgee?

Building with LLMs is powerful, but it comes with challenges:

Exploding AI costs: Token usage adds up fast with RAG, long contexts, and multi-turn conversations
Cost opacity: Bills spike with no visibility into what’s driving costs
Vendor lock-in: Your code is tightly coupled to a single provider’s API
No fallbacks: When OpenAI goes down, your app goes down
Security concerns: Sensitive data flows directly to third-party providers
Fragmented observability: Logs scattered across multiple dashboards

Edgee solves all of this with a single integration.

Core Capabilities

Token Compression

Reduce prompt size by up to 50% without losing intent. Ideal for RAG, long contexts, and multi-turn agents.
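
To make the savings concrete, here is a small worked calculation. It is only a sketch: the quickstart shows a `saved_tokens` field, but the original token count and the per-token price used below are hypothetical inputs you would supply yourself, not values documented by Edgee.

```python
def compression_savings(
    original_input_tokens: int,
    saved_tokens: int,
    usd_per_million_input_tokens: float,
) -> dict:
    """Estimate what a given number of saved input tokens is worth.

    All three arguments are caller-supplied; the price is an assumption,
    not a figure published on this page.
    """
    if original_input_tokens <= 0:
        raise ValueError("original_input_tokens must be positive")
    if not 0 <= saved_tokens <= original_input_tokens:
        raise ValueError("saved_tokens must be between 0 and the original count")

    return {
        # Tokens actually sent to the provider after compression.
        "billed_input_tokens": original_input_tokens - saved_tokens,
        # Fraction of the prompt that was removed (0.5 would be a 50% reduction).
        "reduction_ratio": saved_tokens / original_input_tokens,
        # Input-side savings only; output tokens are not affected.
        "usd_saved": saved_tokens * usd_per_million_input_tokens / 1_000_000,
    }


# Hypothetical request: a 12,000-token prompt with 4,800 tokens saved,
# priced at an assumed $2.50 per million input tokens.
example = compression_savings(12_000, 4_800, 2.50)
print(example)
```

Because the reduction applies to input tokens, the effect on a total bill depends on how much of your spend is input versus output.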

Unified API

One SDK gives you access to 200+ models from OpenAI, Anthropic, Google, Mistral, and more. Switch providers with a one-line change.
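
One way to keep that change to a single line is to hold the model name in one place and pass it through. The sketch below mirrors the `send(model=..., input=...)` call from the Python quickstart, but it runs against a `FakeClient` stand-in so it works offline; `FakeClient`, `ask`, and the `"another-provider-model"` identifier are hypothetical and not part of the Edgee SDK.

```python
from dataclasses import dataclass
from typing import Protocol


class SupportsSend(Protocol):
    """Anything with a send(model=..., input=...) method, as in the quickstart."""

    def send(self, *, model: str, input: str): ...


@dataclass
class FakeResponse:
    text: str


class FakeClient:
    """Offline stand-in for the real client (hypothetical, for illustration)."""

    def send(self, *, model: str, input: str) -> FakeResponse:
        # Echo the model name so the effect of switching is visible.
        return FakeResponse(text=f"[{model}] {input}")


# The single line to change when switching providers.
MODEL = "gpt-4o"


def ask(client: SupportsSend, question: str, model: str = MODEL) -> str:
    """Send one question and return the response text."""
    return client.send(model=model, input=question).text


client = FakeClient()
print(ask(client, "What is the capital of France?"))
# Placeholder identifier; check the model catalog for real names.
print(ask(client, "What is the capital of France?", model="another-provider-model"))
```

With the real SDK you would pass an `Edgee("your-api-key")` instance instead of `FakeClient()`; the rest of the calling code stays the same.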

Intelligent Routing

Automatic failover, load balancing, and smart model selection. Optimize for cost, performance, or both.
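
For readers new to the idea, ordered failover can be sketched in a few lines. This is a conceptual illustration only: Edgee handles routing at the gateway, and this page does not describe its algorithm. The provider functions and names below are hypothetical.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the list has failed."""


def send_with_failover(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, answer) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def flaky_primary(prompt: str) -> str:
    # Simulates an outage at the preferred provider.
    raise TimeoutError("primary timed out")


def healthy_backup(prompt: str) -> str:
    return f"answer to: {prompt}"


used, answer = send_with_failover(
    [("primary", flaky_primary), ("backup", healthy_backup)],
    "ping",
)
print(used, answer)
```

With a gateway in front of your providers, this retry logic lives outside your application code, which is the point of the feature.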

Cost & Observability

Real-time cost tracking, latency metrics, and request logs. Know exactly what your AI is doing and costing.

Privacy Controls

Control how your data flows with configurable logging and retention. Enable provider-side zero data retention (ZDR) where available, and apply privacy layers to prompts.

