Edgee is an edge-native AI gateway that optimizes LLM costs through token compression, intelligent routing, and edge processing. It provides one OpenAI-compatible API to connect to 200+ models while reducing costs by up to 50%.

Your app calls Edgee with a standard OpenAI-compatible API. Edgee compresses prompts at the edge to reduce token usage, routes to cost-efficient models, and applies intelligent policies before forwarding to LLM providers—all while tracking real-time cost savings.

Can I use my own provider API keys?

Yes. You can use Edgee’s unified access with a single Edgee API key, or bring your own provider keys for direct billing and custom models.

What do I get with Edgee?

Up to 50% cost reduction through token compression, one OpenAI-compatible API for 200+ models, intelligent cost-aware routing, real-time savings tracking, and edge-level capabilities—with instant ROI from day one.

Does Edgee work with Claude Code?

Yes. Install the Edgee CLI and run `edgee launch claude` to start Claude Code with transparent token compression. No code changes required — Edgee acts as a proxy, compressing prompts before they reach Anthropic. Most users save 20–50% on token costs immediately.

Which coding agents does Edgee support?

Edgee supports Claude Code, Cursor, OpenCode, Codex, and any coding agent that uses standard LLM API calls. The Edgee CLI wraps your agent transparently — compression happens at the network layer.

Token Compression Gateway for your agents

Name: Edgee AI Gateway
Author: Edgee

Edgee compresses prompts before they reach LLM providers.
Same code, fewer tokens, lower bills.

Up to 50%Cost reduction

26.5%Longer coding sessions

1 minTo install

edgee — zsh

❯

How to use Edgee

Whether you’re using a coding agent or building an app, Edgee compresses your LLM traffic in minutes.

For coding agents

Start saving tokens in 1 minute

Install Edgee CLI and connect it to your coding agent. No code changes required.

No code changes: works as a transparent proxy for your agent
Instant savings: token compression kicks in on the first request
Works with any agent: Claude Code, Codex, Cursor and more

Configure your coding agent

Connect Edgee to your AI coding assistant and start saving tokens in 1 minute.

1Choose your coding agent

2Install Edgee

$curl -fsSL https://install.edgee.ai | bash

3Start saving tokens

$edgee launch claude

Why Edgee AI Gateway?

An edge intelligence layer for your AI traffic

Edgee sits between your application and LLM providers behind a single OpenAI-compatible API. It adds edge-level intelligence, including token compression, routing policies, cost controls, private models, and tools, so you can ship AI features faster and with confidence.

Token compression

Reduce prompt size without losing intent to lower costs and latency, especially for long contexts, RAG pipelines, and multi-turn agents.

Learn more

Edge Tools

Invoke shared tools managed by Edgee, or deploy your own private tools at the edge, closer to users and providers for lower latency and tighter control.

Learn more

Bring Your Own Keys

Use Edgee’s keys for convenience, or plug in your own provider keys for billing control and custom models.

Learn more

Observability

Monitor latency, errors, usage, and cost per model, per app, and per environment.

Learn more

Private Models

Deploy serverless open-source LLMs on demand, where you need them, and expose them through the same gateway API alongside public providers.

Learn more

The vision behind Edgee

Every technological shift creates a new foundation: the web had bandwidth, the cloud had compute, and AI has tokens. In a world powered by models, intelligence has a cost: tokens flow through every interaction, decision, and response.

At Edgee, we believe intelligence should move efficiently, closer to users, intent, and action. It should be compressed, routed, and optimized so decisions happen instantly. Hear from Sacha, Edgee’s co-founder, on how AI scales by mastering how intelligence moves.