Route to any model.
Govern all of them.

13 providers. Hundreds of models. One OpenAI-compatible API.

Add new providers via config, not code.

Core Providers

OpenAI

Popular Models
GPT-4o, GPT-4o-mini, o1, o3-mini
Pattern
gpt-*, o1-*, o3-*
Environment
OPENAI_API_KEY

Anthropic

Popular Models
Claude Sonnet 4, Claude Haiku 4
Pattern
claude-*
Environment
ANTHROPIC_API_KEY

Google Gemini

Popular Models
Gemini 2.0 Flash
Pattern
gemini-*
Environment
GEMINI_API_KEY

Ollama

Local
Popular Models
Llama 3, Mistral, Phi, any GGUF
Pattern
any (local)
Environment
OLLAMA_BASE_URL
No API key needed

Extended Providers

Azure OpenAI

GPT-4o, GPT-4 Turbo via Azure
AZURE_OPENAI_API_KEY

Mistral AI

mistral-large, mistral-small, codestral
MISTRAL_API_KEY

Cohere

command-r-plus, c4ai-aya
COHERE_API_KEY

DeepSeek

deepseek-chat, deepseek-coder
DEEPSEEK_API_KEY

Together AI

Meta Llama, Qwen, Mixtral
TOGETHER_API_KEY

Groq

llama-3, mixtral, gemma (ultra-fast)
GROQ_API_KEY

Fireworks AI

accounts/fireworks/* models
FIREWORKS_API_KEY

vLLM

Self-hosted
OpenAI-compatible self-hosted models
VLLM_BASE_URL

AWS Bedrock

Coming Soon
Claude, Titan, Llama via AWS Bedrock
AWS_BEDROCK_*

Built-in Routing Features

Model Aliasing

# .env
MODEL_ALIASES=fast=gpt-4o-mini,smart=claude-sonnet

Map friendly names to any model. Change routing without code changes.
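A minimal sketch of how that alias map could be parsed and applied. The `MODEL_ALIASES` format comes from the snippet above; the function names are illustrative, not Gatez's actual code:

```python
def parse_model_aliases(raw: str) -> dict[str, str]:
    """Parse a comma-separated alias=model string into a lookup table."""
    aliases = {}
    for pair in raw.split(","):
        alias, _, model = pair.partition("=")
        if alias and model:
            aliases[alias.strip()] = model.strip()
    return aliases

def resolve_model(requested: str, aliases: dict[str, str]) -> str:
    """Return the aliased model if one exists; unknown names pass through."""
    return aliases.get(requested, requested)

aliases = parse_model_aliases("fast=gpt-4o-mini,smart=claude-sonnet")
print(resolve_model("fast", aliases))    # gpt-4o-mini
print(resolve_model("gpt-4o", aliases))  # gpt-4o (unchanged)
```

Because resolution happens at the gateway, changing what "fast" points to is an env-var edit, not a redeploy of your application.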

P2C Load Balancing

Error rate tracking
Latency monitoring
Pending request count

Power of Two Choices algorithm distributes traffic across providers with automatic health scoring.
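The Power of Two Choices idea can be sketched as follows: sample two providers at random, score each on the health signals listed above, and send the request to the better one. The scoring weights here are illustrative assumptions, not Gatez's actual formula:

```python
import random
from dataclasses import dataclass

@dataclass
class ProviderStats:
    name: str
    error_rate: float   # fraction of recent requests that failed
    latency_ms: float   # recent average latency
    pending: int        # in-flight request count

def health_score(p: ProviderStats) -> float:
    """Lower is better. Weights are illustrative, not Gatez's real formula."""
    return p.error_rate * 1000 + p.latency_ms + p.pending * 50

def pick_provider(providers: list[ProviderStats], rng=random) -> ProviderStats:
    """Power of Two Choices: sample two candidates, keep the healthier one."""
    a, b = rng.sample(providers, 2)
    return a if health_score(a) <= health_score(b) else b

pool = [
    ProviderStats("openai", error_rate=0.01, latency_ms=300, pending=4),
    ProviderStats("anthropic", error_rate=0.00, latency_ms=450, pending=1),
    ProviderStats("groq", error_rate=0.20, latency_ms=80, pending=0),
]
print(pick_provider(pool).name)
```

Comparing only two random candidates, rather than scanning the whole pool, avoids herding every request onto the single best-looking provider while still steering traffic away from unhealthy ones.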

Circuit Breaker + Fallback

3 failures open circuit
30s recovery window
Auto-fallback to next provider

Zero downtime for your users when a provider fails.
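The thresholds above can be illustrated with a minimal circuit breaker. The 3-failure and 30-second figures come from the list; the class itself is a sketch, not Gatez's implementation:

```python
import time

class CircuitBreaker:
    """Open after max_failures consecutive errors; probe again after recovery_s."""
    def __init__(self, max_failures: int = 3, recovery_s: float = 30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.recovery_s = recovery_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # circuit closed: traffic flows normally
        # Half-open: allow a probe once the recovery window has elapsed.
        return self.clock() - self.opened_at >= self.recovery_s

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()

breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
print(breaker.allow_request())  # False: circuit open, route to the next provider
```

While a provider's circuit is open, requests that would have gone to it fall through to the next healthy provider, which is what keeps a single-provider outage invisible to users.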

How It Works

# Send to any provider through one API
curl http://localhost:9080/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "x-tenant-id: acme" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'

# Same API, different model
curl http://localhost:9080/v1/chat/completions \
  -d '{"model": "claude-sonnet-4-20250514", ...}'

# Use an alias
curl http://localhost:9080/v1/chat/completions \
  -d '{"model": "fast", ...}'  # resolves to gpt-4o-mini

OpenAI-Compatible API

All providers are accessed through the standard /v1/chat/completions endpoint. Your existing OpenAI SDK code works without changes.
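Because every provider sits behind the same endpoint, only the `model` field changes between requests. A stdlib sketch of the shared request shape, using the gateway URL, token variable, and tenant header from the curl examples above:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str,
                       token: str = "TOKEN", tenant: str = "acme"):
    """Build a chat-completions request; only `model` varies per provider."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "http://localhost:9080/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "x-tenant-id": tenant,
        },
        method="POST",
    )

# Same request shape, three different backends:
for model in ("gpt-4o", "claude-sonnet-4-20250514", "fast"):
    req = build_chat_request(model, "Hello")
    print(req.full_url, json.loads(req.data)["model"])
```

The same holds for any OpenAI SDK: point its `base_url` at the gateway and the rest of the client code stays untouched.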

Missing Your Provider?

Gatez supports any OpenAI-compatible API. Add custom providers via environment variables.

Adding a custom provider
# Add via Control Plane API
curl -X POST http://localhost:4001/api/providers \
  -H 'Content-Type: application/json' \
  -d '{
    "id": "perplexity",
    "display_name": "Perplexity",
    "type": "openai_compatible",
    "base_url": "https://api.perplexity.ai",
    "auth_type": "bearer",
    "key_source": {
      "type": "env_var",
      "var_name": "PERPLEXITY_API_KEY"
    },
    "model_patterns": ["pplx-*", "sonar-*"],
    "default_models": ["pplx-70b-online"],
    "enabled": true
  }'

Ready to route all your models?

Deploy Gatez in 5 minutes. Start routing to 13 providers through one API.