Provider Usage
cliproxyapi++ routes OpenAI-style requests to many provider backends through a unified auth and translation layer.
This page covers provider strategy and high-signal setup patterns. For full block-by-block coverage, use Provider Catalog.
Audience Guidance
- Use this page if you manage provider credentials and model routing.
- Use Routing and Models Reference for selection behavior details.
- Use Troubleshooting for runtime failure triage.
Provider Categories
- Direct APIs: Claude, Gemini, OpenAI, Mistral, Groq, DeepSeek.
- Aggregators: OpenRouter, Together AI, Fireworks AI, Novita AI, SiliconFlow.
- Proprietary/OAuth flows: Kiro, GitHub Copilot, Roo Code, Kilo AI, MiniMax.
Naming and Metadata Conventions
- Use canonical provider keys in config and ops docs (`github-copilot`, `antigravity`, `claude`, `codex`).
- Keep user-facing aliases stable and provider-agnostic where possible (for example `claude-sonnet-4-6`), and map upstream-specific names through `oauth-model-alias`.
- For GitHub Copilot, treat it as a distinct provider channel (`github-copilot`), not a generic "Microsoft account" channel. Account eligibility still depends on Copilot plan entitlements.
Provider-First Architecture
cliproxyapi++ keeps one client-facing API (/v1/*) and pushes provider complexity into configuration:
- Inbound auth is validated against the top-level `api-keys` list.
- Model names are resolved by prefix + alias.
- Routing selects provider/credential based on eligibility.
- Upstream call is translated and normalized back to OpenAI-compatible output.
This lets clients stay stable while provider strategy evolves independently.
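The resolution steps above can be sketched roughly as follows. This is a minimal illustration under assumed data shapes; `PROVIDERS`, `ALIASES`, and `resolve_model` are hypothetical names, not cliproxyapi++ internals.

```python
# Hypothetical sketch of prefix + alias resolution, not the real implementation.

# Configured prefix -> provider block (assumed shape).
PROVIDERS = {"claude-prod": "claude-api-key", "or": "openai-compatibility"}
# Client-stable alias -> upstream model name (assumed example values).
ALIASES = {"core-sonnet": "claude-3-5-sonnet-20241022"}

def resolve_model(requested: str) -> tuple[str, str]:
    """Split 'prefix/model', map the alias to the upstream name, pick the provider."""
    prefix, _, model = requested.partition("/")
    if prefix not in PROVIDERS:
        raise ValueError(f"no provider configured for prefix {prefix!r}")
    upstream = ALIASES.get(model, model)
    return PROVIDERS[prefix], upstream

print(resolve_model("claude-prod/core-sonnet"))
# ('claude-api-key', 'claude-3-5-sonnet-20241022')
```

The point of the indirection: clients only ever see the prefix and alias, so the upstream model name can change without breaking callers.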
Common Configuration Pattern
Use provider-specific blocks in config.yaml:
```yaml
# Client API auth for /v1/*
api-keys:
  - "prod-client-key"

# One direct provider
claude-api-key:
  - api-key: "sk-ant-xxxx"
    prefix: "claude-prod"

# One OpenAI-compatible aggregator
openai-compatibility:
  - name: "openrouter"
    prefix: "or"
    base-url: "https://openrouter.ai/api/v1"
    api-key-entries:
      - api-key: "sk-or-v1-xxxx"
```
MLX and vLLM-MLX Pattern
For MLX servers that expose OpenAI-compatible APIs (for example mlx-openai-server and vllm-mlx), configure them under openai-compatibility:
```yaml
openai-compatibility:
  - name: "mlx-local"
    prefix: "mlx"
    base-url: "http://127.0.0.1:8000/v1"
    api-key-entries:
      - api-key: "dummy-or-local-key"
```
Then request models through the configured prefix (for example `mlx/<model-id>`), the same as with other OpenAI-compatible providers.
Requesting Models
Call standard OpenAI-compatible endpoints:
```bash
curl -sS -X POST http://localhost:8317/v1/chat/completions \
  -H "Authorization: Bearer prod-client-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-prod/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Summarize this repository"}],
    "stream": false
  }'
```
Prefix behavior depends on your `prefix` and `force-model-prefix` settings.
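One way to picture the effect of `force-model-prefix` is the sketch below. This is an assumed reading of the setting (strict mode rejects bare model names rather than matching them loosely), not the project's actual code.

```python
# Assumed semantics of force-model-prefix, for illustration only.
def accept_model(name: str, force_prefix: bool, known_prefixes: set[str]) -> bool:
    prefix, sep, _ = name.partition("/")
    has_known_prefix = bool(sep) and prefix in known_prefixes
    if force_prefix:
        return has_known_prefix   # bare names are rejected outright
    return True                   # bare names fall through to default matching

prefixes = {"claude-prod", "or"}
assert accept_model("claude-prod/claude-3-5-sonnet", True, prefixes)
assert not accept_model("claude-3-5-sonnet", True, prefixes)
assert accept_model("claude-3-5-sonnet", False, prefixes)
```

Under this reading, enabling the flag turns ambiguous routing into an explicit client-side error, which is usually what you want in production.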
Production Routing Pattern
Use this default design in production:
- Primary direct provider for predictable latency.
- Secondary aggregator provider for breadth/failover.
- Prefix isolation by workload (for example `agent-core/*`, `batch/*`).
- Explicit alias map for client-stable model names.
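The primary-plus-fallback idea above can be sketched as a simple failover loop. This is illustrative only; the route order and the `call_provider` hook are assumptions, not cliproxyapi++ behavior.

```python
# Illustrative failover: try the direct provider first, then the aggregator.
ROUTE_ORDER = ["agent-core", "batch"]  # primary direct, secondary aggregator

def call_with_failover(model: str, call_provider) -> str:
    last_err = None
    for prefix in ROUTE_ORDER:
        try:
            return call_provider(prefix, model)
        except RuntimeError as err:      # e.g. upstream 5xx / 429
            last_err = err
    raise RuntimeError(f"all providers failed for {model}") from last_err

# Fake upstream where the primary is down:
def fake(prefix, model):
    if prefix == "agent-core":
        raise RuntimeError("503 from primary")
    return f"{prefix} handled {model}"

print(call_with_failover("core-sonnet", fake))
# batch handled core-sonnet
```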
Example:
```yaml
force-model-prefix: true

claude-api-key:
  - api-key: "sk-ant-..."
    prefix: "agent-core"
    models:
      - name: "claude-3-5-sonnet-20241022"
        alias: "core-sonnet"

openrouter:
  - api-key: "sk-or-v1-..."
    prefix: "batch"
```
Verify Active Model Inventory
```bash
curl -sS http://localhost:8317/v1/models \
  -H "Authorization: Bearer prod-client-key" | jq '.data[].id' | head
```
If a model is missing, verify the provider block, credential validity, and prefix constraints.
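The same check can be scripted by diffing an expected set of model ids against the `/v1/models` payload. A sketch under assumptions: the response follows the OpenAI-compatible `{"data": [{"id": ...}]}` shape, and the example ids are hypothetical.

```python
# Sketch: detect missing models given a /v1/models-style payload.
def missing_models(payload: dict, expected: set[str]) -> set[str]:
    served = {entry["id"] for entry in payload.get("data", [])}
    return expected - served

sample = {"data": [{"id": "agent-core/core-sonnet"}]}
print(missing_models(sample, {"agent-core/core-sonnet", "batch/gpt-4o-mini"}))
# {'batch/gpt-4o-mini'}
```

Any id in the returned set points you back at the provider block, credential, or prefix constraint for that model.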
Rotation and Multi-Credential Guidance
- Add multiple keys per provider to improve resilience.
- Use prefixes to isolate traffic by team or workload.
- Monitor `429` patterns and redistribute traffic before a hard outage.
- Keep at least one fallback provider for every critical workload path.
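A minimal sketch of multi-key rotation with 429-aware cooldown, to make the guidance above concrete. The `KeyRotator` helper and its 60-second cooldown are hypothetical, not cliproxyapi++ internals.

```python
import time

class KeyRotator:
    """Round-robin over keys, skipping any key cooling down after a 429."""
    def __init__(self, keys, cooldown_s=60.0):
        self.keys = list(keys)
        self.cooldown_s = cooldown_s
        self.blocked_until = {k: 0.0 for k in self.keys}
        self.i = 0

    def next_key(self, now=None):
        now = time.monotonic() if now is None else now
        for _ in range(len(self.keys)):
            key = self.keys[self.i % len(self.keys)]
            self.i += 1
            if self.blocked_until[key] <= now:
                return key
        raise RuntimeError("all keys rate-limited; add keys or providers")

    def report_429(self, key, now=None):
        now = time.monotonic() if now is None else now
        self.blocked_until[key] = now + self.cooldown_s

rot = KeyRotator(["key-a", "key-b"])
k = rot.next_key(now=0.0)     # "key-a"
rot.report_429(k, now=0.0)    # key-a cools down for 60s
print(rot.next_key(now=1.0))  # "key-b"
```

The "all keys rate-limited" error is the signal to redistribute traffic or add a fallback provider before it becomes a hard outage.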
Failure Modes and Fixes
- Upstream `401`/`403`: provider key invalid or expired.
- Frequent `429`: provider quota/rate-limit pressure; add keys or providers.
- Unexpected provider choice: model prefix mismatch or alias overlap.
- Provider appears unhealthy: inspect operations endpoints and logs.
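The triage rules above condense into a small lookup table, handy for runbooks or alert annotations. This is just a restatement of the list, with hypothetical symptom keys.

```python
# Condensed triage table for the failure modes listed above (illustrative).
TRIAGE = {
    "upstream_401_403": "provider key invalid or expired; rotate or replace the credential",
    "frequent_429": "quota/rate-limit pressure; add keys or a fallback provider",
    "wrong_provider": "check model prefix and alias maps for mismatch or overlap",
    "provider_unhealthy": "inspect operations endpoints and logs",
}

def triage(symptom: str) -> str:
    return TRIAGE.get(symptom, "unknown symptom; start with logs")

print(triage("frequent_429"))
# quota/rate-limit pressure; add keys or a fallback provider
```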
Provider Quickstarts
Prefer the 5-minute reference flows in: