Rate Limits & Quotas
HTTP Response
HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{
  "status": 429,
  "title": "rate_limit_exceeded",
  "detail": "Request quota exceeded. Retry after the indicated interval."
}
SDK Behavior
Catching Rate Limit Errors
from highflame import Highflame, RateLimitError

client = Highflame(api_key="hf_sk_...")

# Wrapped in a function (the name is illustrative) so the return statements are valid
def evaluate(user_input):
    try:
        return client.guard.evaluate_prompt(user_input)
    except RateLimitError as e:
        # All retries exhausted
        print(f"Rate limited: {e.status} - {e.detail}")
        # Return a graceful degradation response
        return {"decision": "allow", "degraded": True}

Adjusting Retry Behavior
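The SDK's own retry settings are not reproduced here. As a rough illustration of what a retry policy for 429 responses involves, the sketch below implements exponential backoff with full jitter in plain Python. `RateLimited`, `backoff_delays`, and `call_with_backoff` are hypothetical names invented for this example and are not part of the Highflame SDK; the default values are arbitrary.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for a rate-limit error such as the SDK's RateLimitError."""


def backoff_delays(max_retries=4, base=0.5, cap=8.0, rng=random.random):
    """Yield one delay in seconds per retry.

    Full jitter: each delay is rng() times min(cap, base * 2**attempt),
    so the upper bound doubles per attempt until it reaches cap.
    """
    for attempt in range(max_retries):
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_backoff(fn, max_retries=4, base=0.5, cap=8.0, sleep=time.sleep):
    """Call fn(); on RateLimited, sleep and retry up to max_retries times, then re-raise."""
    delays = backoff_delays(max_retries, base, cap)
    while True:
        try:
            return fn()
        except RateLimited:
            delay = next(delays, None)
            if delay is None:
                # Retry budget spent: surface the error to the caller
                raise
            sleep(delay)
```

Passing `sleep` and `rng` as parameters keeps the helper testable without real waiting. In application code, `fn` would be a zero-argument callable wrapping the request, and the `except` clause would name whichever exception the client library actually raises.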
Retry Policy Details
Trigger
Retried?
Notes
High-Volume Workloads
Quota Increases