Operations API
Operations endpoints are used for liveness checks, routing visibility, and incident triage.
Audience Guidance
- SRE/ops: integrate these routes into health checks and dashboards.
- Developers: use them when debugging routing/performance behavior.
Core Endpoints
- GET /health for liveness/readiness-style checks.
- GET /v1/metrics/providers for rolling provider-level performance/usage stats.
Monitoring Examples
Basic liveness check:
bash
curl -sS -f http://localhost:8317/health
Provider metrics snapshot:
bash
curl -sS http://localhost:8317/v1/metrics/providers | jq
Prometheus-friendly probe command:
bash
curl -sS -o /dev/null -w '%{http_code}\n' http://localhost:8317/health
Suggested Operational Playbook
- Check /health first.
- Inspect /v1/metrics/providers for latency/error concentration.
- Correlate with request logs and model-level failures.
- Shift traffic (prefix/model/provider) when a provider degrades.
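The first two playbook steps can be scripted. What follows is a minimal sketch under stated assumptions, not a definitive implementation. The function names (classify_health, probe_health, error_ratio_exceeds), the 5-second timeout, and the rule that any 2xx status means healthy are all hypothetical choices made for illustration. The response schema of /v1/metrics/providers is not described in this section, so the ratio check takes raw error and request counts as arguments instead of parsing the response.

```shell
#!/usr/bin/env bash
# Base URL matches the examples in this section; override via the environment.
BASE_URL="${BASE_URL:-http://localhost:8317}"

# Map an HTTP status code to a triage verdict.
# Assumption: any 2xx counts as healthy (not specified by the source).
classify_health() {
  case "$1" in
    2??) echo "healthy" ;;
    000) echo "unreachable" ;;  # curl writes 000 when no HTTP response arrived
    *)   echo "unhealthy" ;;
  esac
}

# Playbook step 1: probe /health and print the verdict.
probe_health() {
  local code
  code="$(curl -sS -o /dev/null -m 5 -w '%{http_code}' "$BASE_URL/health" 2>/dev/null)"
  classify_health "${code:-000}"
}

# Playbook step 2 helper: exit 0 when errors/total is strictly above threshold.
# Arguments: error count, total request count, threshold between 0 and 1.
# The counts must be read from the metrics output by hand or by a parser
# written against the real schema, which this sketch does not assume.
error_ratio_exceeds() {
  awk -v e="$1" -v t="$2" -v th="$3" 'BEGIN { exit !(t > 0 && e / t > th) }'
}
```

For example, probe_health prints healthy, unhealthy, or unreachable, and error_ratio_exceeds 12 200 0.05 succeeds because 12/200 is 0.06. Keeping the classification separate from the network call makes the decision logic easy to check without a running server.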
Failure Modes
- Health endpoint flaps: resource saturation or startup race.
- Provider metrics stale/empty: no recent traffic or exporter initialization issues.
- High error ratio on one provider: auth expiry, upstream outage, or rate-limit pressure.
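To tell a flapping health endpoint from one that is consistently down, it can help to sample /health several times and count the failures. The sketch below is one possible approach, not part of the API: the function names count_failures and sample_health, the one-second interval, the 2-second timeout, and the treatment of every non-2xx code as a failure are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Base URL matches the examples in this section; override via the environment.
BASE_URL="${BASE_URL:-http://localhost:8317}"

# Count how many of the given HTTP status codes are not 2xx.
count_failures() {
  local failures=0 code
  for code in "$@"; do
    case "$code" in
      2??) ;;                               # success, nothing to count
      *)   failures=$((failures + 1)) ;;    # includes 000 (no response)
    esac
  done
  echo "$failures"
}

# Probe /health N times (default 5), one second apart, and print the codes.
sample_health() {
  local n="${1:-5}" i codes=()
  for ((i = 0; i < n; i++)); do
    codes+=("$(curl -sS -o /dev/null -m 2 -w '%{http_code}' "$BASE_URL/health" 2>/dev/null)")
    sleep 1
  done
  echo "${codes[@]}"
}
```

A possible invocation is count_failures $(sample_health 10). A count strictly between 0 and 10 suggests flapping, while 10 of 10 points to a hard failure rather than a startup race or intermittent saturation.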