Rate Limits

This section describes the API's current rate-limit behavior.

Rate limiting is configured per deployment. When enabled, it is enforced per client key, which is usually the resolved client IP address.

When the limit is exceeded, the API returns:

HTTP/1.1 429 Too Many Requests
Retry-After: 1

{
  "code": "rate_limit_error",
  "message": "rate limit exceeded",
  "details": null
}

The Retry-After value is a whole number of seconds; fractional waits are rounded up.
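The rounding rule and one way a client might read the header can be sketched as follows. This is a minimal illustration, not the service's actual code: the function names `retry_after_seconds` and `delay_from_response`, and the 1-second fallback, are assumptions made for the example.

```python
import math
from typing import Mapping, Optional


def retry_after_seconds(remaining: float) -> int:
    """Round the time until the limit resets up to whole seconds.

    Hypothetical helper: shows the "rounded up" rule only.
    """
    return math.ceil(remaining)


def delay_from_response(
    status: int,
    headers: Mapping[str, str],
    default: float = 1.0,  # assumed fallback, not part of the API contract
) -> Optional[float]:
    """Return how long to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("retry-after", "")
    try:
        return float(max(0, int(raw)))
    except ValueError:
        # Missing or malformed header: fall back to the assumed default.
        return default
```

A client would typically sleep for the returned delay and then retry the request, ideally with a cap on the number of attempts.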

If the rate-limit backend fails, the middleware allows the request and logs the backend error. This fail-open behavior keeps a limiter outage from becoming an API outage.
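The fail-open decision can be sketched as below. This is an illustration under stated assumptions, not the actual middleware: `BackendError`, `should_allow`, and the `check` callable are hypothetical names introduced for the example.

```python
import logging
from typing import Callable

logger = logging.getLogger("rate_limit")


class BackendError(Exception):
    """Hypothetical error raised when the limiter backend is unavailable."""


def should_allow(client_key: str, check: Callable[[str], bool]) -> bool:
    """Decide whether a request may proceed.

    `check` is an assumed callable that returns True while the client is
    within its limit and raises BackendError if the backend fails.
    """
    try:
        return check(client_key)
    except BackendError as exc:
        # Fail open: log the backend error and let the request through.
        logger.error("rate-limit backend error for %s: %s", client_key, exc)
        return True
```

The trade-off is that limits are not enforced while the backend is down, so the logged errors are worth alerting on.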

The current implementation does not emit X-RateLimit-* headers or define public plan-based limits in the API contract.