Rate Limits

All /flare/v1/* endpoints are rate-limited per project based on the project's plan tier. Limits are enforced using a sliding-window algorithm.
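For intuition about what a sliding window means in practice, here is a minimal sketch of a sliding-window log limiter. It is illustrative only: this page does not describe how the service implements its window, and `SlidingWindowLog` and its members are hypothetical names. The same structure can be used client-side to pace outgoing requests.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical client-side sketch; not the service's implementation.
public sealed class SlidingWindowLog
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Queue<DateTimeOffset> _timestamps = new Queue<DateTimeOffset>();

    public SlidingWindowLog(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns true and records the request if fewer than 'limit' requests
    // fall inside the window that ends at 'now'.
    public bool TryAcquire(DateTimeOffset now)
    {
        // Drop entries that have slid out of the window.
        while (_timestamps.Count > 0 && now - _timestamps.Peek() >= _window)
            _timestamps.Dequeue();

        if (_timestamps.Count >= _limit)
            return false;

        _timestamps.Enqueue(now);
        return true;
    }
}
```

Unlike a fixed hourly bucket, capacity here frees up gradually as individual requests age out of the window, rather than all at once on the hour.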

Rate Limit Tiers

Plan	Requests / hour	Price
Free	1,000	Free
Starter	5,000	$29/mo
Pro	50,000	$99/mo
Business	500,000	$299/mo
Enterprise	Custom	Custom

The requests-per-hour limit counts every API call (job creation, polling, listing, metrics, and so on) against the project's hourly budget.

Monthly Job Allowance

In addition to per-hour rate limits, each project has a monthly job-creation allowance based on its plan tier. This limit applies only to POST /flare/v1/jobs (job creation) and resets at the start of each billing period.

Plan	Jobs / month	Price
Free	50,000	Free
Starter	300,000	$29/mo
Pro	3,000,000	$99/mo
Business	30,000,000	$299/mo
Enterprise	Custom	Custom

When the allowance is reached, behavior depends on whether opt-in overage is enabled for the project:

Overage disabled (the default, and always the case on Free): new job-creation requests return 429 monthly_allowance_exceeded until the next billing period.
Overage enabled: job creation continues, and the extra volume is metered and billed at the plan's overage rate ($1.50 per 10,000 jobs on Starter, $1.00 on Pro, $0.50 on Business). A customer-set spend cap bounds the monthly overage charge; once it is reached, new job-creation requests return 429 spend_cap_reached.
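As a worked example of the overage arithmetic, the sketch below estimates the accrued charge in cents from the published rates. It assumes overage is billed pro rata per job; this page does not say whether partial blocks of 10,000 jobs are rounded, so treat the result as an estimate. `OverageEstimator` and its members are hypothetical names, not part of the API or SDK.

```csharp
using System;

// Hypothetical helper for estimating overage charges.
public static class OverageEstimator
{
    // Overage rates from the plan descriptions, in cents per 10,000 jobs.
    public static decimal RateCentsPer10k(string plan) => plan switch
    {
        "Starter" => 150m,
        "Pro" => 100m,
        "Business" => 50m,
        _ => throw new ArgumentException($"No published overage rate for plan '{plan}'.")
    };

    // Assumption: pro-rata billing per job, with no rounding of partial blocks.
    public static decimal EstimateCents(string plan, long allowance, long jobsCreated)
    {
        long overageJobs = Math.Max(0, jobsCreated - allowance);
        return overageJobs * RateCentsPer10k(plan) / 10_000m;
    }
}
```

For a Starter project that creates 325,000 jobs against its 300,000 allowance, `OverageEstimator.EstimateCents("Starter", 300_000, 325_000)` gives 375 cents: 25,000 extra jobs at $1.50 per 10,000 is $3.75.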

Quota response headers

Every POST /flare/v1/jobs response includes quota headers:

Header	Type	Description
`X-Quota-Limit`	integer	Monthly job allowance for this project
`X-Quota-Used`	integer	Jobs created in the current billing period
`X-Quota-Overage`	integer	Jobs created beyond the allowance this period
`X-Quota-Overage-Cost`	integer	Accrued overage charge this period, in cents
`X-Quota-Spend-Cap`	integer	Overage spend cap in cents (omitted when no cap is set)
`X-Quota-Reset`	integer	Unix timestamp (seconds) of the billing-period end
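The headers in the table above can be read into a typed value with a small helper such as the following sketch. `QuotaInfo` and `QuotaHeaders` are hypothetical names, not SDK types. The helper treats a missing X-Quota-Spend-Cap as "no cap", as the table describes; defaulting missing overage headers to zero is an assumption of this sketch.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Net.Http;

// Hypothetical container for the X-Quota-* header values.
public sealed record QuotaInfo(
    long Limit,
    long Used,
    long Overage,
    long OverageCostCents,
    long? SpendCapCents,
    DateTimeOffset Reset);

public static class QuotaHeaders
{
    // Returns null when the core headers (limit, used, reset) are missing or malformed.
    public static QuotaInfo? TryRead(HttpResponseMessage response)
    {
        long? limit = ReadLong(response, "X-Quota-Limit");
        long? used = ReadLong(response, "X-Quota-Used");
        long? reset = ReadLong(response, "X-Quota-Reset");
        if (limit is null || used is null || reset is null)
            return null;

        return new QuotaInfo(
            limit.Value,
            used.Value,
            ReadLong(response, "X-Quota-Overage") ?? 0,       // assumption: absent means 0
            ReadLong(response, "X-Quota-Overage-Cost") ?? 0,  // assumption: absent means 0
            ReadLong(response, "X-Quota-Spend-Cap"),          // null when no cap is set
            DateTimeOffset.FromUnixTimeSeconds(reset.Value));
    }

    private static long? ReadLong(HttpResponseMessage response, string name) =>
        response.Headers.TryGetValues(name, out var values)
        && long.TryParse(values.FirstOrDefault(), out var parsed)
            ? parsed
            : (long?)null;
}
```

Comparing `Used` with `Limit` after each job creation lets you warn before the allowance runs out, rather than reacting to the first 429.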

429 Allowance Exceeded

When the monthly allowance is exhausted and overage is disabled, the API returns a 429 status code with error code monthly_allowance_exceeded:

{
  "error": {
    "code": "monthly_allowance_exceeded",
    "message": "Monthly job allowance of 50000 reached. Enable overage or upgrade your plan.",
    "request_id": "01JAXBKM3N4P5Q6R7S8T9UVWXY"
  }
}

If overage is enabled but the spend cap is reached, the code is spend_cap_reached instead. The X-Quota-* headers are included on 429 responses, so you can read X-Quota-Reset to know when the allowance resets.

Handling allowance exceeded

using System;
using System.Linq;
using System.Net;
using System.Text.Json;

var response = await httpClient.PostAsync("/flare/v1/jobs", content);

if (response.StatusCode == HttpStatusCode.TooManyRequests)
{
    // Read error.code from the JSON envelope to distinguish an
    // allowance/overage rejection from an hourly rate limit
    var body = await response.Content.ReadAsStringAsync();
    using var doc = JsonDocument.Parse(body);
    var code = doc.RootElement.GetProperty("error").GetProperty("code").GetString();

    if (code == "monthly_allowance_exceeded" || code == "spend_cap_reached")
    {
        // X-Quota-Reset is a Unix timestamp (seconds) for the billing-period end
        if (response.Headers.TryGetValues("X-Quota-Reset", out var values)
            && long.TryParse(values.First(), out var resetEpoch))
        {
            var resetTime = DateTimeOffset.FromUnixTimeSeconds(resetEpoch);
            Console.WriteLine($"Allowance resets at {resetTime:u}");
        }

        // Wait for the next billing period, enable overage, or upgrade your plan
        return;
    }
}

Tip: Distinguish a rate-limit rejection (rate_limit_exceeded) from an allowance/overage rejection (monthly_allowance_exceeded or spend_cap_reached) by checking the error.code field in the response body.

Response Headers

Every successful response to a /flare/v1/* endpoint includes rate limit headers (they are also present on 429 responses):

Header	Type	Description
`X-RateLimit-Limit`	integer	Maximum requests allowed per hour for this project
`X-RateLimit-Remaining`	integer	Requests remaining in the current window
`X-RateLimit-Reset`	integer	Unix timestamp (seconds) when the current window resets

Example response headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4832
X-RateLimit-Reset: 1710779400
Content-Type: application/json

429 Too Many Requests

When the rate limit is exhausted, the API returns a 429 status code with the standard error envelope:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit of 5000 requests per hour exceeded.",
    "request_id": "01JAXBKM3N4P5Q6R7S8T9UVWXY"
  }
}

The X-RateLimit-* headers are still included on 429 responses, so you can read X-RateLimit-Reset to know when to retry.

Export endpoint cooldowns

In addition to the per-project hourly bucket, export endpoints carry their own per-project cooldown of 1 request per minute. A second call inside the same 60-second window returns 429 rate_limit_exceeded with a Retry-After header:

POST /flare/v1/jobs/export — 1 export per minute per project.
GET /platform/v1/projects/{projectId}/audit-log/export — 1 export per minute per project.

The two buckets are tracked separately, so a jobs-export call does not consume the audit-export budget. The cooldown is also independent of the per-project hourly rate limit, so a project on the Business plan can still trip it.
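A client can honor the cooldown by reading Retry-After from the 429 response, as in the sketch below. This page does not say whether the header carries delta-seconds or an HTTP date, so the helper accepts both forms and falls back to the documented 60-second window. `ExportCooldown` is a hypothetical name.

```csharp
using System;
using System.Net.Http;

// Hypothetical helper for export-endpoint 429 responses.
public static class ExportCooldown
{
    // Returns how long to wait before the next export attempt.
    public static TimeSpan GetRetryDelay(HttpResponseMessage response, DateTimeOffset now)
    {
        var retryAfter = response.Headers.RetryAfter;

        // Retry-After given as a number of seconds.
        if (retryAfter?.Delta is TimeSpan delta)
            return delta;

        // Retry-After given as an absolute HTTP date.
        if (retryAfter?.Date is DateTimeOffset date)
            return date > now ? date - now : TimeSpan.Zero;

        // Header missing or unparseable: fall back to the 60-second cooldown.
        return TimeSpan.FromSeconds(60);
    }
}
```

Call it as `await Task.Delay(ExportCooldown.GetRetryDelay(response, DateTimeOffset.UtcNow))` before retrying the export.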

Handling Rate Limits

Recommended backoff strategy

1. Read the X-RateLimit-Reset header from the 429 response.
2. Compute the wait time: reset_timestamp - current_timestamp (treat a negative result as zero).
3. Wait that duration before retrying.

C# example

using System;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

var response = await httpClient.PostAsync("/flare/v1/jobs", content);

if (response.StatusCode == HttpStatusCode.TooManyRequests)
{
    // X-RateLimit-Reset is a Unix timestamp in seconds
    if (response.Headers.TryGetValues("X-RateLimit-Reset", out var values)
        && long.TryParse(values.First(), out var resetEpoch))
    {
        var resetTime = DateTimeOffset.FromUnixTimeSeconds(resetEpoch);
        var delay = resetTime - DateTimeOffset.UtcNow;

        // Skip the wait if the reset time has already passed
        if (delay > TimeSpan.Zero)
            await Task.Delay(delay);
    }

    // Retry the request once
    response = await httpClient.PostAsync("/flare/v1/jobs", content);
}

Proactive throttling

To avoid hitting the limit in the first place, monitor the X-RateLimit-Remaining header on every response. When it drops below a threshold (e.g., 10% of X-RateLimit-Limit), reduce your request rate.
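One way to act on that threshold is to spread the remaining budget evenly over the time left before the reset, as in this sketch. The pacing rule is a design choice of this example, not documented service behavior, and `ProactiveThrottle` is a hypothetical name. The inputs are the parsed values of the three X-RateLimit-* headers.

```csharp
using System;

// Hypothetical pacing helper driven by the X-RateLimit-* header values.
public static class ProactiveThrottle
{
    // Returns a pause to insert before the next request, or zero when the
    // remaining budget is still above the threshold (10% by default).
    public static TimeSpan SuggestedDelay(
        long limit,
        long remaining,
        DateTimeOffset resetAt,
        DateTimeOffset now,
        double threshold = 0.10)
    {
        if (limit <= 0 || remaining > limit * threshold)
            return TimeSpan.Zero;

        var untilReset = resetAt - now;
        if (untilReset <= TimeSpan.Zero)
            return TimeSpan.Zero;

        // Spread the remaining requests evenly across the rest of the window.
        return TimeSpan.FromTicks(untilReset.Ticks / Math.Max(1, remaining));
    }
}
```

With a limit of 5,000, 100 requests remaining, and 10 minutes until the reset, the helper suggests a 6-second pause between requests; with 4,832 remaining it returns zero.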

SDK Behavior

The Zeridion.Flare SDK surfaces rate limit information but does not perform automatic 429-aware backoff:

When a 429 response is received, the SDK throws a FlareRateLimitException with Limit, Remaining, and ResetAt properties derived from the X-RateLimit-* response headers.
The SDK's background worker catches all exceptions (including FlareRateLimitException) and waits for the configured PollInterval before retrying. It does not use the ResetAt property to compute a longer backoff. For enqueue calls in your application code, implement your own retry/backoff logic using the ResetAt property (see the C# example above).

See the SDK Exceptions reference for details on FlareRateLimitException.
