# Count Tokens

> Estimate token count for a set of messages without making an LLM call

Estimates the number of input tokens for a set of messages without sending the request to an LLM provider. Useful for pre-flight cost estimation, rate-limit planning, and prompt optimization.
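
As a minimal sketch of a pre-flight check, the Python below builds and sends a request to this endpoint using only the standard library. The server URL, the `/v1/count_tokens` path, the bearer header, and the `input_tokens` response field are taken from the specification on this page; the helper names `build_count_tokens_request` and `count_tokens` are illustrative assumptions, not part of any Edgee SDK.

```python
import json
import urllib.request

# Server URL and path as given in the OpenAPI specification on this page.
API_URL = "https://api.edgee.ai/v1/count_tokens"


def build_count_tokens_request(api_key, model, messages):
    """Build (but do not send) a POST request for /v1/count_tokens.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def count_tokens(api_key, model, messages):
    """Send the request and return the estimated input token count."""
    req = build_count_tokens_request(api_key, model, messages)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["input_tokens"]
```

Calling `count_tokens(api_key, "openai/gpt-5.2", messages)` with a list of `{"role": ..., "content": ...}` dictionaries returns the integer estimate. Because the count is approximate, it is better suited to budgeting than to exact billing.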


## OpenAPI

````yaml POST /v1/count_tokens
openapi: 3.0.1
info:
  title: Edgee API
  version: 1.0.0
  description: >-
    Edgee is an edge-native AI Gateway with private model hosting, automatic
    model selection, cost audits/alerts, and edge tools. This API is
    OpenAI-compatible, providing one API for any model and any provider.
servers:
  - url: https://api.edgee.ai
    description: Edgee AI Gateway
security:
  - bearerAuth: []
tags:
  - name: Chat
    description: Chat completion endpoints (OpenAI format)
  - name: Messages
    description: Messages endpoints (Anthropic format)
  - name: Responses
    description: Responses endpoints (OpenAI Responses API format)
  - name: Models
    description: Model management endpoints
  - name: Tokens
    description: Token estimation endpoints
paths:
  /v1/count_tokens:
    post:
      tags:
        - Tokens
      summary: Count tokens
      description: >-
        Estimates the number of input tokens for a set of messages without
        making an LLM call. Accepts both OpenAI chat format and Anthropic
        Messages format; the format is auto-detected from the message structure.
        Useful for pre-flight cost estimation, rate-limit planning, and prompt
        optimization.


        **Note:** Token counts are approximate and may differ from
        provider-native tokenizers (e.g. OpenAI tiktoken, Anthropic's
        tokenizer).
      operationId: countTokens
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CountTokensRequest'
            example:
              model: openai/gpt-5.2
              messages:
                - role: system
                  content: You are a helpful assistant.
                - role: user
                  content: What is the capital of France?
      responses:
        '200':
          description: Token count estimated successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CountTokensResponse'
              example:
                input_tokens: 42
        '400':
          description: Bad request - invalid input parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '401':
          description: Unauthorized - missing or invalid API key
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
components:
  schemas:
    CountTokensRequest:
      type: object
      required:
        - model
      properties:
        model:
          type: string
          description: >-
            ID of the target model. Format: `{author_id}/{model_id}`. The
            gateway uses this to pick the appropriate tokenizer when `tokenizer`
            is not provided.
          example: openai/gpt-5.2
        messages:
          type: array
          description: >-
            Optional array of message objects to count tokens for. Accepts both
            OpenAI chat format (with `system`, `user`, `assistant` roles) and
            Anthropic Messages format; the format is auto-detected from the
            message structure. Defaults to an empty array.
          items:
            type: object
            required:
              - role
              - content
            properties:
              role:
                type: string
                description: The role of the message author.
              content:
                description: >-
                  The message content. Can be a plain string or an array of
                  content blocks.
                oneOf:
                  - type: string
                  - type: array
                    items:
                      type: object
            additionalProperties: true
        system:
          description: >-
            Optional system prompt. Accepts a plain string or an array of
            Anthropic content blocks. Used when counting tokens for an
            Anthropic-style request.
          oneOf:
            - type: string
            - type: array
              items:
                type: object
        tokenizer:
          type: string
          enum:
            - cl100k_base
            - o200k_base
          description: >-
            Explicit tokenizer override. When omitted, the gateway picks one
            based on `model`.
    CountTokensResponse:
      type: object
      required:
        - input_tokens
      properties:
        input_tokens:
          type: integer
          description: >-
            Estimated number of input tokens for the provided messages. This is
            an approximation; counts may differ from provider-native tokenizers.
            Use for estimation and budgeting, not exact billing.
          minimum: 0
          example: 42
    ErrorResponse:
      type: object
      required:
        - error
      description: >-
        Error response. The `error` object follows OpenAI's error envelope
        shape; the gateway additionally populates `type` (Anthropic-style
        category) and `param` when applicable.
      properties:
        error:
          type: object
          required:
            - message
          properties:
            message:
              type: string
              description: A human-readable error message.
            type:
              type: string
              enum:
                - invalid_request_error
                - authentication_error
                - permission_error
                - not_found_error
                - rate_limit_error
                - server_error
                - provider_error
              description: Anthropic-style high-level error category. Always present.
            code:
              type: string
              nullable: true
              description: >-
                A machine-readable error code. Currently emitted values:
                `unauthorized`, `forbidden`, `invalid_json`, `bad_model_id`,
                `model_not_found`, `provider_not_supported`,
                `invalid_tokenizer`, `invalid_request`, `usage_limit_exceeded`,
                `provider_error`, `internal_error`.
              example: bad_model_id
            param:
              type: string
              nullable: true
              description: >-
                Name of the request parameter that caused the error, when
                applicable.
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
      description: >-
        Bearer authentication header of the form `Bearer <token>`, where
        `<token>` is your API key. See
        [Authentication](/docs/api-reference/authentication) for details.

````
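
The optional request fields and the error envelope described in the schema above can be handled client-side along the following lines. This is a sketch under stated assumptions: the field names (`model`, `messages`, `system`, `tokenizer`, and the `error` object's `message`, `type`, `code`, `param`) and the two tokenizer values come from the schema, while the helper names `build_payload` and `describe_error` and the summary string format are hypothetical.

```python
import json

# Tokenizer names listed in the `tokenizer` enum of CountTokensRequest.
ALLOWED_TOKENIZERS = {"cl100k_base", "o200k_base"}


def build_payload(model, messages=None, system=None, tokenizer=None):
    """Assemble a CountTokensRequest body, omitting optional fields left unset.

    Hypothetical helper: rejects tokenizer names outside the documented enum
    before any request is sent.
    """
    if tokenizer is not None and tokenizer not in ALLOWED_TOKENIZERS:
        raise ValueError(f"unsupported tokenizer: {tokenizer!r}")
    payload = {"model": model}  # `model` is the only required field
    if messages is not None:
        payload["messages"] = messages
    if system is not None:
        payload["system"] = system  # string or array of content blocks
    if tokenizer is not None:
        payload["tokenizer"] = tokenizer
    return payload


def describe_error(raw_body):
    """Summarize an ErrorResponse body as 'type/code (param): message'.

    Only `message` is required by the schema, so the other fields are
    read defensively and skipped when missing or null.
    """
    err = json.loads(raw_body)["error"]
    label = err.get("type") or "unknown"
    if err.get("code"):
        label += "/" + err["code"]
    if err.get("param"):
        label += f" ({err['param']})"
    return f"{label}: {err['message']}"
```

For an Anthropic-style request, pass the system prompt through `system` rather than as a `system`-role message; omitting `tokenizer` leaves the choice to the gateway, which selects one based on `model`.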