Route to any model.
Govern all of them.

13 providers. Hundreds of models. One OpenAI-compatible API.

Add new providers via config, not code.

Core Providers

OpenAI

Popular Models
GPT-4o, GPT-4o-mini, o1, o3-mini
Pattern
gpt-*, o1-*, o3-*
Environment
OPENAI_API_KEY

Anthropic

Popular Models
Claude Sonnet 4, Claude Haiku 4
Pattern
claude-*
Environment
ANTHROPIC_API_KEY

Google Gemini

Popular Models
Gemini 2.0 Flash
Pattern
gemini-*
Environment
GEMINI_API_KEY

Ollama

Local
Popular Models
Llama 3, Mistral, Phi, any GGUF
Pattern
any (local)
Environment
OLLAMA_BASE_URL
No API key needed

Extended Providers

Azure OpenAI

GPT-4o, GPT-4 Turbo via Azure
AZURE_OPENAI_API_KEY

Mistral AI

mistral-large, mistral-small, codestral
MISTRAL_API_KEY

Cohere

command-r-plus, c4ai-aya
COHERE_API_KEY

DeepSeek

deepseek-chat, deepseek-coder
DEEPSEEK_API_KEY

Together AI

Meta Llama, Qwen, Mixtral
TOGETHER_API_KEY

Groq

llama-3, mixtral, gemma (ultra-fast)
GROQ_API_KEY

Fireworks AI

accounts/fireworks/* models
FIREWORKS_API_KEY

vLLM

Self-hosted
OpenAI-compatible self-hosted models
VLLM_BASE_URL

AWS Bedrock

Coming Soon
Claude, Titan, Llama via AWS Bedrock
AWS_BEDROCK_*

Built-in Routing Features

Model Aliasing

# .env
MODEL_ALIASES=fast=gpt-4o-mini,smart=claude-sonnet

Map friendly names to any model. Change routing without code changes.
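A minimal sketch of how that alias map could be parsed and applied. The `MODEL_ALIASES` format comes from the snippet above; the function names are illustrative, not Gatez's actual code:

```python
def parse_model_aliases(raw: str) -> dict[str, str]:
    """Parse a comma-separated alias=model string into a lookup table."""
    aliases = {}
    for pair in raw.split(","):
        alias, _, model = pair.partition("=")
        if alias and model:
            aliases[alias.strip()] = model.strip()
    return aliases

def resolve_model(requested: str, aliases: dict[str, str]) -> str:
    """Return the aliased model if one exists; unknown names pass through."""
    return aliases.get(requested, requested)

aliases = parse_model_aliases("fast=gpt-4o-mini,smart=claude-sonnet")
print(resolve_model("fast", aliases))    # gpt-4o-mini
print(resolve_model("gpt-4o", aliases))  # gpt-4o (unchanged)
```

Because resolution happens at the gateway, changing what "fast" points to is an env-var edit, not a redeploy of your application.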

P2C Load Balancing

Error rate tracking
Latency monitoring
Pending request count

Power of Two Choices algorithm distributes traffic across providers with automatic health scoring.
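The Power of Two Choices idea can be sketched as follows: sample two providers at random, score each on the health signals listed above, and send the request to the better one. The scoring weights here are illustrative assumptions, not Gatez's actual formula:

```python
import random
from dataclasses import dataclass

@dataclass
class ProviderStats:
    name: str
    error_rate: float   # fraction of recent requests that failed
    latency_ms: float   # recent average latency
    pending: int        # in-flight request count

def health_score(p: ProviderStats) -> float:
    """Lower is better. Weights are illustrative, not Gatez's real formula."""
    return p.error_rate * 1000 + p.latency_ms + p.pending * 50

def pick_provider(providers: list[ProviderStats], rng=random) -> ProviderStats:
    """Power of Two Choices: sample two candidates, keep the healthier one."""
    a, b = rng.sample(providers, 2)
    return a if health_score(a) <= health_score(b) else b

pool = [
    ProviderStats("openai", error_rate=0.01, latency_ms=300, pending=4),
    ProviderStats("anthropic", error_rate=0.00, latency_ms=450, pending=1),
    ProviderStats("groq", error_rate=0.20, latency_ms=80, pending=0),
]
print(pick_provider(pool).name)
```

Comparing only two random candidates, rather than scanning the whole pool, avoids herding every request onto the single best-looking provider while still steering traffic away from unhealthy ones.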

Circuit Breaker + Fallback

3 failures open circuit
30s recovery window
Auto-fallback to next provider

Zero downtime for your users when a provider fails.
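The thresholds above can be illustrated with a minimal circuit breaker. The 3-failure and 30-second figures come from the list; the class itself is a sketch, not Gatez's implementation:

```python
import time

class CircuitBreaker:
    """Open after max_failures consecutive errors; probe again after recovery_s."""
    def __init__(self, max_failures: int = 3, recovery_s: float = 30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.recovery_s = recovery_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # circuit closed: traffic flows normally
        # Half-open: allow a probe once the recovery window has elapsed.
        return self.clock() - self.opened_at >= self.recovery_s

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()

breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
print(breaker.allow_request())  # False: circuit open, route to the next provider
```

While a provider's circuit is open, requests that would have gone to it fall through to the next healthy provider, which is what keeps a single-provider outage invisible to users.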

How It Works

# Send to any provider through one API
curl http://localhost:9080/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: acme" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

# Same API, different model
curl http://localhost:9080/v1/chat/completions \
  -d '{"model": "claude-sonnet-4-20250514", ...}'

# Use an alias
curl http://localhost:9080/v1/chat/completions \
  -d '{"model": "fast", ...}'  # resolves to gpt-4o-mini

OpenAI-Compatible API

All providers are accessed through the standard /v1/chat/completions endpoint. Your existing OpenAI SDK code works without changes.
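Because every provider sits behind the same endpoint, only the `model` field changes between requests. A stdlib sketch of the shared request shape, using the gateway URL, token variable, and tenant header from the curl examples above:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str,
                       token: str = "TOKEN", tenant: str = "acme"):
    """Build a chat-completions request; only `model` varies per provider."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "http://localhost:9080/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "x-tenant-id": tenant,
        },
        method="POST",
    )

# Same request shape, three different backends:
for model in ("gpt-4o", "claude-sonnet-4-20250514", "fast"):
    req = build_chat_request(model, "Hello")
    print(req.full_url, json.loads(req.data)["model"])
```

The same holds for any OpenAI SDK: point its `base_url` at the gateway and the rest of the client code stays untouched.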

Missing Your Provider?

Gatez supports any OpenAI-compatible API. Add custom providers via environment variables.

Adding a custom provider
# Add via Control Plane API
curl -X POST http://localhost:4001/api/providers \
  -H 'Content-Type: application/json' \
  -d '{
    "id": "perplexity",
    "display_name": "Perplexity",
    "type": "openai_compatible",
    "base_url": "https://api.perplexity.ai",
    "auth_type": "bearer",
    "key_source": {
      "type": "env_var",
      "var_name": "PERPLEXITY_API_KEY"
    },
    "model_patterns": ["pplx-*", "sonar-*"],
    "default_models": ["pplx-70b-online"],
    "enabled": true
  }'

Ready to route all your models?

Deploy Gatez in 5 minutes. Start routing to 13 providers through one API.