Edgee is an AI Gateway that reduces LLM costs by up to 50% through intelligent token compression. Behind a single OpenAI-compatible API, you get access to 200+ models with automatic cost optimization, intelligent routing, and full observability.

Get Started in 6 Lines

import Edgee from 'edgee';

const edgee = new Edgee("your-api-key");

const response = await edgee.send({
  model: 'gpt-5.2',
  input: 'What is the capital of France?',
});

console.log(response.text);
if (response.compression) {
  console.log(`Tokens saved: ${response.compression.saved_tokens}`);
}
That’s it. You now have access to every major LLM provider, automatic failovers, cost tracking, and full observability, all through one simple API.

3B+ Requests/Month

Up to 50% Input Token Reduction

100+ Global PoPs

Why Choose Edgee?

Building with LLMs is powerful, but comes with challenges:
  • Exploding AI costs: Token usage adds up fast, whether you’re running RAG pipelines, coding with Claude Code, or building multi-turn agents
  • Cost opacity: Bills spike with no visibility into what’s driving costs
  • Vendor lock-in: Your code is tightly coupled to a single provider’s API
  • No fallbacks: When OpenAI goes down, your app goes down
  • Security concerns: Sensitive data flows directly to third-party providers
  • Fragmented observability: Logs scattered across multiple dashboards
Edgee solves all of this with a single integration.

Core Capabilities

Token Compression for Agentic Workloads

AI-powered context optimization that reduces token usage. Perfect for long-context prompts and agentic workloads where context windows matter.

Token Compression for Claude Code

Lossless compression for Claude Code that triples your plan’s session duration.

Cost & Observability

Real-time cost tracking, latency metrics, and request logs. Know exactly what your AI is doing and costing.
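
As a rough client-side illustration of the savings data, the sketch below totals the `compression.saved_tokens` field from the quick-start across several responses. The helper name `summarizeSavings` and the sample objects are hypothetical, and only the fields shown in the quick-start are assumed to exist. The gateway's own reporting remains the authoritative source.

```javascript
// Hypothetical helper (not part of the Edgee SDK): aggregates compression
// stats from response objects shaped like the quick-start response.
function summarizeSavings(responses) {
  let savedTokens = 0;
  let compressedCount = 0;
  for (const response of responses) {
    // `compression` may be absent, so guard before reading it.
    const saved = response?.compression?.saved_tokens;
    if (typeof saved === 'number') {
      savedTokens += saved;
      compressedCount += 1;
    }
  }
  return { requests: responses.length, compressedCount, savedTokens };
}

// Stand-in responses with made-up values; real ones would come from edgee.send().
const sample = [
  { text: 'Paris', compression: { saved_tokens: 120 } },
  { text: 'Berlin' }, // no compression reported for this response
  { text: 'Rome', compression: { saved_tokens: 30 } },
];

console.log(summarizeSavings(sample));
```

With the sample data, the summary reports 150 tokens saved across 2 of the 3 requests.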

Unified API

One SDK with access to 200+ models from OpenAI, Anthropic, Google, Mistral, and more. Switch providers with a single-line change.
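
One way to keep a provider switch down to a single change is to resolve the model identifier in one place. The sketch below assumes the `{ model, input }` request shape from the quick-start. The `EDGEE_MODEL` environment variable, the helper names, and the `'another-model-id'` value are hypothetical placeholders, not documented Edgee settings.

```javascript
// Default taken from the quick-start example.
const DEFAULT_MODEL = 'gpt-5.2';

// Hypothetical: read the model from an assumed EDGEE_MODEL variable,
// falling back to the default when it is unset or blank.
function resolveModel(env = process.env) {
  const configured = env.EDGEE_MODEL;
  return configured && configured.trim() !== '' ? configured.trim() : DEFAULT_MODEL;
}

// Builds a request object in the { model, input } shape used by the quick-start.
function buildRequest(input, env = process.env) {
  return { model: resolveModel(env), input };
}

// Same call site, different model: only the configuration changes.
console.log(buildRequest('What is the capital of France?', {}));
console.log(buildRequest('What is the capital of France?', { EDGEE_MODEL: 'another-model-id' }));
```

The resulting object can be passed to `edgee.send(...)` as in the quick-start, so application code stays unchanged when the model does.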