Routing and Models Reference
This page explains how cliproxyapi++ selects credentials/providers and resolves model names.
Audience Guidance
- Platform operators tuning reliability and quota usage.
- Developers debugging model resolution and fallback behavior.
Request Flow
- Client sends an OpenAI-compatible request to /v1/*.
- API key auth is checked (Authorization: Bearer <client-key>).
- Model name is resolved against configured providers, prefixes, and aliases.
- Credential/provider is chosen by routing strategy.
- Upstream request is translated and executed.
- Response is normalized back to OpenAI-compatible JSON/SSE.
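As a rough illustration of the first two steps, the sketch below builds (but does not send) an OpenAI-compatible chat request with the bearer header. The base URL reuses the localhost:8317 address from the curl examples on this page; the model name is a placeholder and the helper name build_chat_request is hypothetical, not part of cliproxyapi++.

```python
import json
import urllib.request

# Assumption: the proxy listens on the same host/port as the curl examples.
BASE_URL = "http://localhost:8317"
CLIENT_KEY = "<client-key>"  # placeholder; substitute a real client key


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat request without sending it."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {CLIENT_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("model-name", "hi")
# To actually send it: urllib.request.urlopen(req)
```

Model resolution, credential selection, and upstream translation then happen inside the proxy; the client only sees the normalized response.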
Endpoint behavior note:
- For Copilot Codex-family models (*codex*, including gpt-5.1-codex-mini), route through /v1/responses.
- For non-Codex Copilot and most other providers, /v1/chat/completions remains the default path.
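A client-side helper that applies this rule might look like the sketch below. It assumes the caller already knows which provider serves the model; the provider label "copilot" and the function name endpoint_for are illustrative assumptions, not identifiers taken from cliproxyapi++.

```python
from fnmatch import fnmatchcase


def endpoint_for(provider: str, model: str) -> str:
    """Pick the request path for a model, following the note above."""
    # Drop an optional "prefix/" and compare case-insensitively.
    name = model.rsplit("/", 1)[-1].lower()
    # Assumption: "copilot" is how the caller labels Copilot-backed models.
    if provider == "copilot" and fnmatchcase(name, "*codex*"):
        return "/v1/responses"
    return "/v1/chat/completions"
```

For example, endpoint_for("copilot", "gpt-5.1-codex-mini") selects /v1/responses, while a non-Codex Copilot model falls through to /v1/chat/completions.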
Routing Controls in config.yaml
yaml
routing:
  strategy: "round-robin" # round-robin | fill-first
force-model-prefix: false
request-retry: 3
max-retry-interval: 30
quota-exceeded:
  switch-project: true
  switch-preview-model: true
Notes:
- quota-exceeded.switch-project and quota-exceeded.switch-preview-model are the current built-in automatic quota fallback controls.
- There is no generic per-provider auto-disable/auto-enable scheduler yet; for Gemini keys, use model exclusions/aliases plus these fallback toggles.
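To make the difference between the two strategy values concrete, here is a minimal sketch under assumed semantics: round-robin rotates through the available credentials, and fill-first keeps using the first available credential until it is marked exhausted. The class CredentialPicker and its methods are hypothetical, and the real selection logic in cliproxyapi++ may differ.

```python
class CredentialPicker:
    """Toy model of the two routing strategies (assumed semantics)."""

    def __init__(self, credentials, strategy="round-robin"):
        if strategy not in ("round-robin", "fill-first"):
            raise ValueError(f"unknown strategy: {strategy}")
        self.credentials = list(credentials)
        self.strategy = strategy
        self.exhausted = set()
        self._next = 0  # rotation counter, used by round-robin only

    def mark_exhausted(self, credential):
        """Record that a credential hit its quota and should be skipped."""
        self.exhausted.add(credential)

    def pick(self):
        available = [c for c in self.credentials if c not in self.exhausted]
        if not available:
            raise RuntimeError("no credentials available")
        if self.strategy == "fill-first":
            # Stay on the first usable credential until it is exhausted.
            return available[0]
        # round-robin: advance through the available credentials in order.
        choice = available[self._next % len(available)]
        self._next += 1
        return choice
```

Under these assumptions, round-robin spreads load evenly across credentials, while fill-first concentrates usage on one credential at a time, which can matter when quotas are tracked per credential.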
Model Prefix and Alias Behavior
- A credential/provider prefix (for example team-a) can require requests like team-a/model-name.
- With force-model-prefix: true, unprefixed model calls are restricted.
- Per-provider alias mappings can translate client-stable names to upstream names.
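The resolution rules above can be sketched as a small lookup. This is an illustration under assumptions, not the proxy's implementation: the provider dictionaries, the id and prefix field names, and the reading of "restricted" as "unprefixed requests skip prefixed credentials" are all hypothetical.

```python
def resolve_model(requested, providers, force_model_prefix=False):
    """Return (provider_id, upstream_model) for a client-supplied model name."""
    prefix, sep, name = requested.partition("/")
    if not sep:  # no "prefix/" present in the request
        prefix, name = None, requested
    for provider in providers:
        provider_prefix = provider.get("prefix")
        if prefix is not None:
            if provider_prefix != prefix:
                continue  # prefixed request: only the matching prefix may serve it
        elif force_model_prefix and provider_prefix:
            continue  # assumed meaning of "restricted" for unprefixed calls
        for model in provider["models"]:
            if name in (model["name"], model.get("alias")):
                return provider["id"], model["name"]  # alias maps to upstream name
    raise LookupError(f"model_not_found: {requested}")


# Hypothetical provider table used only for this illustration.
providers = [
    {"id": "codex-1", "prefix": None,
     "models": [{"name": "gpt-5-codex", "alias": "codex-latest"}]},
    {"id": "team-a-cred", "prefix": "team-a",
     "models": [{"name": "model-name"}]},
]
```

With this table, "codex-latest" resolves to the upstream name gpt-5-codex, and "team-a/model-name" resolves only through the team-a credential.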
Example alias configuration:
yaml
codex-api-key:
  - api-key: "sk-xxxx"
    models:
      - name: "gpt-5-codex"
        alias: "codex-latest"
Client request:
json
{ "model": "codex-latest", "messages": [{"role":"user","content":"hi"}] }
Metrics and Routing Diagnosis
bash
# Per-provider rolling stats
curl -sS http://localhost:8317/v1/metrics/providers | jq
# Runtime health
curl -sS http://localhost:8317/health
Use these signals with logs to confirm if retries, throttling, or auth issues are driving fallback.
Common Routing Failure Modes
- model_not_found: model alias/prefix not exposed by configured credentials.
- Wrong provider selected: prefix overlap or non-explicit model name.
- High latency spikes: provider degraded; add retries or alternate providers.
- Repeated 429: insufficient credential pool for traffic profile.
- 400 on Codex model via chat endpoint: retry with /v1/responses and verify resolved model is Codex-family.
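For quick triage in a client or a log-processing script, the failure modes above can be encoded as a lookup. The function name diagnose and its parameters are hypothetical, and the returned hints simply restate the list; the sketch does not inspect any real cliproxyapi++ error format.

```python
def diagnose(status: int, error_code: str = "", model: str = "", endpoint: str = "") -> str:
    """Map an observed failure to the likely cause listed above."""
    if error_code == "model_not_found":
        return "model alias/prefix not exposed by configured credentials"
    if status == 429:
        return "insufficient credential pool for traffic profile"
    if (
        status == 400
        and "codex" in model.lower()
        and endpoint.endswith("/chat/completions")
    ):
        return "retry with /v1/responses and verify resolved model is Codex-family"
    return "no known pattern; check logs and provider metrics"
```

A call such as diagnose(400, model="gpt-5.1-codex-mini", endpoint="/v1/chat/completions") points at the endpoint mismatch; wrong-provider selection and latency spikes are not detectable from a single status code and still need logs or metrics.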