
Rate Limits

All /flare/v1/* endpoints are rate-limited per project based on the project's plan tier. Limits are enforced using a sliding-window algorithm.

Rate Limit Tiers

| Plan | Requests / hour | Price |
| --- | --- | --- |
| Free | 1,000 | Free |
| Starter | 5,000 | $29/mo |
| Pro | 50,000 | $99/mo |
| Business | 500,000 | $299/mo |
| Enterprise | Custom | Custom |

The requests-per-hour limit counts every API call against the project, including job creation, polling, listing, and metrics requests.

Monthly Job Allowance

In addition to per-hour rate limits, each project has a monthly job-creation allowance based on its plan tier. This limit applies only to POST /flare/v1/jobs (job creation) and resets at the start of each billing period.

| Plan | Jobs / month | Price |
| --- | --- | --- |
| Free | 50,000 | Free |
| Starter | 300,000 | $29/mo |
| Pro | 3,000,000 | $99/mo |
| Business | 30,000,000 | $299/mo |
| Enterprise | Custom | Custom |

When the allowance is reached, behavior depends on whether opt-in overage is enabled for the project:

  • Overage disabled (the default, and always the case on Free): new job creates return 429 monthly_allowance_exceeded until the next billing period.
  • Overage enabled: job creates continue, and the extra volume is metered and billed at the plan's overage rate ($1.50 per 10,000 jobs on Starter, $1.00 on Pro, $0.50 on Business). A customer-set spend cap bounds the monthly overage charge; once it is reached, new creates return 429 spend_cap_reached.
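
As a rough illustration of the overage arithmetic, the sketch below estimates the accrued charge in cents for a given job count. It assumes the charge is prorated linearly per job and rounded up to the next cent; the documentation only states the per-10,000 rates, so the actual billing granularity may differ. `OverageEstimator` and its members are hypothetical names, not part of the API or SDK.

```csharp
using System;

// Hypothetical helper: estimates overage charges from the published per-10,000-job rates.
public static class OverageEstimator
{
    // Overage rate in cents per 10,000 jobs (Starter / Pro / Business).
    public static int RateCentsPer10k(string plan) => plan switch
    {
        "Starter" => 150,
        "Pro" => 100,
        "Business" => 50,
        // Free never has overage; Enterprise pricing is custom.
        _ => throw new ArgumentException($"No published overage rate for plan '{plan}'.")
    };

    // ASSUMPTION: linear proration per job, rounded up to a whole cent.
    public static long EstimateCents(string plan, long jobsCreated, long allowance)
    {
        long overageJobs = Math.Max(0, jobsCreated - allowance);
        long rate = RateCentsPer10k(plan);
        return (overageJobs * rate + 9_999) / 10_000; // integer ceiling division
    }
}
```

Under these assumptions, a Pro project that creates 3,250,000 jobs against its 3,000,000 allowance would accrue an estimated 2,500 cents ($25.00). The authoritative figure is the X-Quota-Overage-Cost header described below.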

Quota response headers

Every POST /flare/v1/jobs response includes quota headers:

| Header | Type | Description |
| --- | --- | --- |
| X-Quota-Limit | integer | Monthly job allowance for this project |
| X-Quota-Used | integer | Jobs created in the current billing period |
| X-Quota-Overage | integer | Jobs created beyond the allowance this period |
| X-Quota-Overage-Cost | integer | Accrued overage charge this period, in cents |
| X-Quota-Spend-Cap | integer | Overage spend cap in cents (omitted when no cap is set) |
| X-Quota-Reset | integer | Unix timestamp (seconds) of the billing-period end |
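
One way to consume these headers is to parse them into a small typed value after each job-creation call. The following is a minimal sketch, not SDK code: `QuotaSnapshot` and `QuotaHeaders.TryRead` are hypothetical names, and the only facts taken from the documentation are the header names, their integer type, and that X-Quota-Spend-Cap may be absent.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Net.Http;

// Hypothetical container for the X-Quota-* values; not an SDK type.
public sealed record QuotaSnapshot(
    long Limit,
    long Used,
    long Overage,
    long OverageCostCents,
    long? SpendCapCents,      // null when X-Quota-Spend-Cap is omitted (no cap set)
    DateTimeOffset ResetAt);

public static class QuotaHeaders
{
    // Returns null when a required header is missing or is not an integer.
    public static QuotaSnapshot? TryRead(HttpResponseMessage response)
    {
        long? Read(string name) =>
            response.Headers.TryGetValues(name, out var values)
            && long.TryParse(values.FirstOrDefault(), out var parsed)
                ? parsed
                : (long?)null;

        var limit = Read("X-Quota-Limit");
        var used = Read("X-Quota-Used");
        var overage = Read("X-Quota-Overage");
        var cost = Read("X-Quota-Overage-Cost");
        var reset = Read("X-Quota-Reset");

        if (limit is null || used is null || overage is null || cost is null || reset is null)
            return null;

        return new QuotaSnapshot(
            limit.Value, used.Value, overage.Value, cost.Value,
            Read("X-Quota-Spend-Cap"),
            DateTimeOffset.FromUnixTimeSeconds(reset.Value));
    }
}
```

A caller could, for instance, log a warning when `Used` approaches `Limit`, or when `OverageCostCents` approaches `SpendCapCents`.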

429 Allowance Exceeded

When the monthly allowance is exhausted and overage is disabled, the API returns a 429 status code with error code monthly_allowance_exceeded:

{
  "error": {
    "code": "monthly_allowance_exceeded",
    "message": "Monthly job allowance of 50000 reached. Enable overage or upgrade your plan.",
    "request_id": "01JAXBKM3N4P5Q6R7S8T9UVWXY"
  }
}

If overage is enabled but the spend cap is reached, the code is spend_cap_reached instead. The X-Quota-* headers are included on 429 responses, so you can read X-Quota-Reset to know when the allowance resets.

Handling allowance exceeded

using System.Linq;
using System.Text.Json;

var response = await httpClient.PostAsync("/flare/v1/jobs", content);

if (response.StatusCode == System.Net.HttpStatusCode.TooManyRequests)
{
    // Read error.code to distinguish an allowance/overage rejection from a rate limit.
    // This assumes the body follows the standard error envelope shown above.
    var body = await response.Content.ReadAsStringAsync();
    using var doc = JsonDocument.Parse(body);
    var code = doc.RootElement.GetProperty("error").GetProperty("code").GetString();

    if (code == "monthly_allowance_exceeded" || code == "spend_cap_reached")
    {
        // X-Quota-Reset is the Unix timestamp (seconds) of the billing-period end
        if (response.Headers.TryGetValues("X-Quota-Reset", out var values)
            && long.TryParse(values.First(), out var resetEpoch))
        {
            var resetTime = DateTimeOffset.FromUnixTimeSeconds(resetEpoch);
            Console.WriteLine($"Allowance resets at {resetTime:u}");
        }

        // Wait for the next billing period, enable overage, or upgrade your plan
        return;
    }
}

Tip: Distinguish a rate-limit rejection (rate_limit_exceeded) from an allowance/overage rejection (monthly_allowance_exceeded or spend_cap_reached) by checking the error.code field in the response body.

Response Headers

Every successful response to a /flare/v1/* endpoint includes rate limit headers, and the same headers are present on 429 responses:

| Header | Type | Description |
| --- | --- | --- |
| X-RateLimit-Limit | integer | Maximum requests allowed per hour for this project |
| X-RateLimit-Remaining | integer | Requests remaining in the current window |
| X-RateLimit-Reset | integer | Unix timestamp (seconds) when the current window resets |

Example response headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4832
X-RateLimit-Reset: 1710779400
Content-Type: application/json

429 Too Many Requests

When the rate limit is exhausted, the API returns a 429 status code with the standard error envelope:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit of 5000 requests per hour exceeded.",
    "request_id": "01JAXBKM3N4P5Q6R7S8T9UVWXY"
  }
}

The X-RateLimit-* headers are still included on 429 responses, so you can read X-RateLimit-Reset to know when to retry.

Export endpoint cooldowns

In addition to the per-project hourly limit, export endpoints carry their own per-project cooldown of 1 request per minute. A second call inside the same 60-second window returns 429 rate_limit_exceeded with a Retry-After header:

  • POST /flare/v1/jobs/export — 1 export per minute per project.
  • GET /platform/v1/projects/{projectId}/audit-log/export — 1 export per minute per project.

The two cooldowns are tracked separately, so a jobs-export call does not consume the audit-export budget. Each cooldown is also independent of the per-project hourly rate limit, so even a project on the Business plan can trip it.
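
Because an export cooldown 429 carries Retry-After, a client can prefer that header and fall back to X-RateLimit-Reset for ordinary rate-limit rejections. The sketch below is one possible approach, assuming Retry-After uses either of the two forms HTTP allows (delay in seconds or an HTTP date); the documentation does not say which form the API sends. `RetryDelay.From` is a hypothetical helper name.

```csharp
using System;
using System.Linq;
using System.Net.Http;

public static class RetryDelay
{
    // Prefers Retry-After (sent on export cooldowns); falls back to X-RateLimit-Reset.
    // 'now' is passed in so the calculation is deterministic and easy to test.
    public static TimeSpan From(HttpResponseMessage response, DateTimeOffset now)
    {
        var retryAfter = response.Headers.RetryAfter;

        if (retryAfter?.Delta is TimeSpan delta)
            return delta;                               // Retry-After: <seconds>

        if (retryAfter?.Date is DateTimeOffset date)
            return NonNegative(date - now);             // Retry-After: <HTTP-date>

        if (response.Headers.TryGetValues("X-RateLimit-Reset", out var values)
            && long.TryParse(values.FirstOrDefault(), out var resetEpoch))
            return NonNegative(DateTimeOffset.FromUnixTimeSeconds(resetEpoch) - now);

        return TimeSpan.Zero; // no hint available; the caller chooses its own backoff
    }

    private static TimeSpan NonNegative(TimeSpan value) =>
        value > TimeSpan.Zero ? value : TimeSpan.Zero;
}
```

Typical use would be `await Task.Delay(RetryDelay.From(response, DateTimeOffset.UtcNow));` before retrying the export call.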

Handling Rate Limits

  1. Read the X-RateLimit-Reset header from the 429 response
  2. Compute the wait time: reset_timestamp - current_timestamp
  3. Wait that duration before retrying

C# example

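using System.Linq;

var response = await httpClient.PostAsync("/flare/v1/jobs", content);

// Check error.code first if needed: allowance rejections also use 429
// and will not succeed on retry within the same billing period.
if (response.StatusCode == System.Net.HttpStatusCode.TooManyRequests)
{
    // Steps 1-2: read X-RateLimit-Reset and compute the time until the window resets
    if (response.Headers.TryGetValues("X-RateLimit-Reset", out var values)
        && long.TryParse(values.First(), out var resetEpoch))
    {
        var resetTime = DateTimeOffset.FromUnixTimeSeconds(resetEpoch);
        var delay = resetTime - DateTimeOffset.UtcNow;

        // Step 3: wait only if the reset time is still in the future
        if (delay > TimeSpan.Zero)
            await Task.Delay(delay);
    }

    // Retry the request once
    response = await httpClient.PostAsync("/flare/v1/jobs", content);
}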

Proactive throttling

To avoid hitting the limit in the first place, monitor the X-RateLimit-Remaining header on every response. When it drops below a threshold (e.g., 10% of X-RateLimit-Limit), reduce your request rate.
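
A simple way to act on this is to check the threshold after each response and, once below it, space out the remaining requests over the time left in the window. The sketch below illustrates that idea; `Pacing` and its methods are hypothetical names, and it treats X-RateLimit-Reset as the moment the budget is replenished, which is a simplification of a sliding window.

```csharp
using System;

public static class Pacing
{
    // True when 'remaining' has dropped below the given fraction of 'limit' (default 10%).
    public static bool IsBelowThreshold(long remaining, long limit, double fraction = 0.10) =>
        limit > 0 && remaining < limit * fraction;

    // Spreads the remaining request budget evenly over the time left until reset.
    public static TimeSpan DelayBetweenRequests(long remaining, DateTimeOffset resetAt, DateTimeOffset now)
    {
        var timeLeft = resetAt - now;
        if (timeLeft <= TimeSpan.Zero)
            return TimeSpan.Zero;          // window has already reset
        if (remaining <= 0)
            return timeLeft;               // budget spent; wait for the reset
        return TimeSpan.FromTicks(timeLeft.Ticks / remaining);
    }
}
```

For example, with 300 requests remaining and 10 minutes until the reset, `DelayBetweenRequests` returns 2 seconds, so sending one request every 2 seconds would use up the budget just as the window resets.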

SDK Behavior

The Zeridion.Flare SDK surfaces rate limit information but does not perform automatic 429-aware backoff:

  • When a 429 response is received, the SDK throws a FlareRateLimitException with Limit, Remaining, and ResetAt properties derived from the X-RateLimit-* response headers.
  • The SDK's background worker catches all exceptions (including FlareRateLimitException) and waits for the configured PollInterval before retrying. It does not use the ResetAt value to compute a longer backoff. For enqueue calls in your application code, implement your own retry/backoff logic using the ResetAt property (see the C# example above).
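
For enqueue calls, one option is a small wrapper that waits until the reported reset time before retrying. The following is a sketch under stated assumptions rather than SDK functionality: `ResetAwareRetry` is a hypothetical helper, and the reset time is supplied through a caller-provided delegate so the block does not depend on the SDK's types.

```csharp
using System;
using System.Threading.Tasks;

public static class ResetAwareRetry
{
    // Runs 'operation', retrying up to 'maxAttempts' times in total.
    // 'getResetAt' returns the reset time for a rate-limit exception, or null for
    // any other exception, which is then rethrown unchanged.
    public static async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, DateTimeOffset?> getResetAt,
        int maxAttempts = 3)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts
                                       && getResetAt(ex) is DateTimeOffset resetAt)
            {
                // Wait until the window resets; skip the wait if it already has.
                var delay = resetAt - DateTimeOffset.UtcNow;
                if (delay > TimeSpan.Zero)
                    await Task.Delay(delay);
            }
        }
    }
}
```

With the SDK, the delegate might look like `ex => (ex as FlareRateLimitException)?.ResetAt`, assuming ResetAt is a DateTimeOffset; check the SDK Exceptions reference for the actual type. After the final attempt the exception propagates to the caller.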

See the SDK Exceptions reference for details on FlareRateLimitException.