Rate Limits
All /flare/v1/* endpoints are rate-limited per project based on the project's plan tier. Rate limiting is enforced by the RateLimitMiddleware using a sliding window algorithm.
Rate Limit Tiers
| Plan | Requests / hour | Price |
|---|---|---|
| Free | 100 | Free |
| Starter | 5,000 | $29/mo |
| Pro | 50,000 | $99/mo |
| Business | 500,000 | $299/mo |
The requests-per-hour limit applies to all API calls (job creation, polling, listing, metrics, etc.). The limit is enforced per project based on the project's RateLimitPerHour setting.
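To make "sliding window" concrete, here is a minimal sketch of one common variant, a sliding-window log. It is illustrative only: the actual RateLimitMiddleware implementation is not shown in this document, and the SlidingWindowLimiter class and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sliding-window log limiter. This is NOT the actual
// RateLimitMiddleware; all names here are hypothetical.
public sealed class SlidingWindowLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Queue<DateTimeOffset> _timestamps = new();
    private readonly object _gate = new();

    public SlidingWindowLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns true and records the request if fewer than `limit` requests
    // were accepted in the window ending at `now`.
    public bool TryAcquire(DateTimeOffset now)
    {
        lock (_gate)
        {
            // Evict timestamps that have slid out of the window.
            while (_timestamps.Count > 0 && now - _timestamps.Peek() >= _window)
                _timestamps.Dequeue();

            if (_timestamps.Count >= _limit)
                return false;

            _timestamps.Enqueue(now);
            return true;
        }
    }
}
```

Unlike a fixed hourly bucket, capacity frees up gradually as older requests age out of the trailing hour.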
Daily Job Quota
In addition to per-hour rate limits, each project has a daily job creation quota based on its plan tier. This limit applies only to POST /flare/v1/jobs (job creation) and resets at midnight UTC.
| Plan | Jobs / day | Price |
|---|---|---|
| Free | 1,000 | Free |
| Starter | 10,000 | $29/mo |
| Pro | 100,000 | $99/mo |
| Business | 1,000,000 | $299/mo |
Quota response headers
Every POST /flare/v1/jobs response includes quota headers:
| Header | Type | Description |
|---|---|---|
| X-Quota-Limit | integer | Maximum jobs allowed per day for this project |
| X-Quota-Used | integer | Jobs created today so far |
| X-Quota-Reset | integer | Unix timestamp (seconds) of next midnight UTC |
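As a rough sketch of how a client might read these headers, the helper below collects the three values into one object. QuotaInfo and QuotaHeaders are hypothetical names introduced for illustration; they are not part of the Zeridion.Flare SDK.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Net.Http;

// Hypothetical container for the X-Quota-* values; not an SDK type.
public sealed record QuotaInfo(long Limit, long Used, DateTimeOffset ResetsAt)
{
    // Jobs that can still be created before the quota resets.
    public long Remaining => Math.Max(0, Limit - Used);
}

public static class QuotaHeaders
{
    // Returns null if any of the three headers is missing or not an integer.
    public static QuotaInfo? TryRead(HttpResponseMessage response)
    {
        if (TryGetLong(response, "X-Quota-Limit", out var limit)
            && TryGetLong(response, "X-Quota-Used", out var used)
            && TryGetLong(response, "X-Quota-Reset", out var reset))
        {
            return new QuotaInfo(limit, used, DateTimeOffset.FromUnixTimeSeconds(reset));
        }
        return null;
    }

    private static bool TryGetLong(HttpResponseMessage response, string name, out long value)
    {
        value = 0;
        return response.Headers.TryGetValues(name, out var values)
            && long.TryParse(values.FirstOrDefault(), out value);
    }
}
```

Calling QuotaHeaders.TryRead(response) after each POST /flare/v1/jobs lets you log or alert on Remaining before the quota runs out.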
429 Daily Quota Exceeded
When the daily job quota is exhausted, the API returns a 429 status code with error code daily_quota_exceeded:
{
"error": {
"code": "daily_quota_exceeded",
"message": "Daily job creation quota of 1000 jobs exceeded.",
"request_id": "01JAXBKM3N4P5Q6R7S8T9UVWXY"
}
}
The X-Quota-* headers are still included on 429 responses, so you can read X-Quota-Reset to know when the quota resets.
Handling quota exceeded
var response = await httpClient.PostAsync("/flare/v1/jobs", content);
if (response.StatusCode == System.Net.HttpStatusCode.TooManyRequests)
{
    // Check if this is a quota issue (vs. rate limit)
    var body = await response.Content.ReadAsStringAsync();
    if (body.Contains("daily_quota_exceeded"))
    {
        if (response.Headers.TryGetValues("X-Quota-Reset", out var values)
            && long.TryParse(values.First(), out var resetEpoch))
        {
            var resetTime = DateTimeOffset.FromUnixTimeSeconds(resetEpoch);
            Console.WriteLine($"Quota resets at {resetTime:u}");
        }
        // Either wait until midnight UTC or upgrade your plan
        return;
    }
}
You can distinguish between rate limit exhaustion (rate_limit_exceeded) and quota exhaustion (daily_quota_exceeded) by checking the error.code field in the response body.
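A substring match on the body works, but reading error.code from the parsed JSON is more precise. The sketch below uses System.Text.Json and assumes the error envelope shape shown in the examples on this page; FlareErrors and TryGetErrorCode are hypothetical names, not SDK members.

```csharp
#nullable enable
using System.Text.Json;

public static class FlareErrors
{
    // Returns error.code from the error envelope, or null if the body is
    // not valid JSON or does not have the expected shape.
    public static string? TryGetErrorCode(string body)
    {
        try
        {
            using var doc = JsonDocument.Parse(body);
            if (doc.RootElement.ValueKind == JsonValueKind.Object
                && doc.RootElement.TryGetProperty("error", out var error)
                && error.ValueKind == JsonValueKind.Object
                && error.TryGetProperty("code", out var code)
                && code.ValueKind == JsonValueKind.String)
            {
                return code.GetString();
            }
        }
        catch (JsonException)
        {
            // Not JSON (for example, an HTML error page from a proxy).
        }
        return null;
    }
}
```

On a 429, compare the result against "daily_quota_exceeded" or "rate_limit_exceeded" to decide whether to read X-Quota-Reset or X-RateLimit-Reset.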
Response Headers
Every successful response to a /flare/v1/* endpoint includes rate limit headers:
| Header | Type | Description |
|---|---|---|
| X-RateLimit-Limit | integer | Maximum requests allowed per hour for this project |
| X-RateLimit-Remaining | integer | Requests remaining in the current window |
| X-RateLimit-Reset | integer | Unix timestamp (seconds) when the current window resets |
Example response headers
HTTP/1.1 200 OK
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4832
X-RateLimit-Reset: 1710779400
Content-Type: application/json
429 Too Many Requests
When the rate limit is exhausted, the API returns a 429 status code with the standard error envelope:
{
"error": {
"code": "rate_limit_exceeded",
"message": "Rate limit of 5000 requests per hour exceeded.",
"request_id": "01JAXBKM3N4P5Q6R7S8T9UVWXY"
}
}
The X-RateLimit-* headers are still included on 429 responses, so you can read X-RateLimit-Reset to know when to retry.
Handling Rate Limits
Recommended backoff strategy
- Read the X-RateLimit-Reset header from the 429 response
- Compute the wait time: reset_timestamp - current_timestamp
- Wait that duration before retrying
C# example
var response = await httpClient.PostAsync("/flare/v1/jobs", content);
if (response.StatusCode == System.Net.HttpStatusCode.TooManyRequests)
{
    if (response.Headers.TryGetValues("X-RateLimit-Reset", out var values)
        && long.TryParse(values.First(), out var resetEpoch))
    {
        var resetTime = DateTimeOffset.FromUnixTimeSeconds(resetEpoch);
        var delay = resetTime - DateTimeOffset.UtcNow;
        if (delay > TimeSpan.Zero)
            await Task.Delay(delay);
    }
    // Retry the request
    response = await httpClient.PostAsync("/flare/v1/jobs", content);
}
Proactive throttling
To avoid hitting the limit in the first place, monitor the X-RateLimit-Remaining header on every response. When it drops below a threshold (e.g., 10% of X-RateLimit-Limit), reduce your request rate.
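One possible way to act on that threshold is sketched below: once the remaining budget falls to 10% or less, spread the remaining requests evenly over the time left until the reset. The ProactiveThrottle class, its SuggestedPause method, and the even-spacing policy are assumptions made for illustration; the API does not prescribe a particular strategy.

```csharp
using System;

public static class ProactiveThrottle
{
    // Returns a pause to insert before the next request. The pause is zero
    // while more than `thresholdFraction` of the hourly budget remains;
    // otherwise it is the time left until the reset divided evenly across
    // the remaining requests.
    public static TimeSpan SuggestedPause(
        long limit,
        long remaining,
        DateTimeOffset resetAt,
        DateTimeOffset now,
        double thresholdFraction = 0.10)
    {
        if (limit <= 0 || remaining > limit * thresholdFraction)
            return TimeSpan.Zero;

        var untilReset = resetAt - now;
        if (untilReset <= TimeSpan.Zero)
            return TimeSpan.Zero;

        // With nothing left, wait out the rest of the window.
        return remaining <= 0 ? untilReset : untilReset / remaining;
    }
}
```

Feed it the parsed X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset values from the latest response, and pass the result to Task.Delay before sending the next request.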
SDK Behavior
The Zeridion.Flare SDK surfaces rate limit information but does not perform automatic 429-aware backoff:
- When a 429 response is received, the SDK throws a FlareRateLimitException with Limit, Remaining, and ResetAt properties derived from the X-RateLimit-* response headers.
- The SDK's worker poll loop catches all exceptions (including FlareRateLimitException) and waits for the configured PollInterval before retrying. It does not inspect the ResetAt property to compute a longer backoff. For enqueue calls in your application code, implement your own retry/backoff logic using the ResetAt property (see the C# example above).
See the SDK Exceptions reference for details on FlareRateLimitException.
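Since the SDK leaves backoff to the caller, a small wrapper along the following lines may help for enqueue calls. This is a sketch under stated assumptions: RateLimitRetry is a hypothetical helper, not part of the SDK, and it is generic over the exception type so it makes no claims about the SDK's actual signatures. The type of ResetAt is not specified here; the sketch assumes it can be expressed as a DateTimeOffset.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; not part of the Zeridion.Flare SDK.
public static class RateLimitRetry
{
    // Runs `operation`. When it throws TException, waits until the reset
    // time reported by `getResetAt`, then tries again, for at most
    // `maxAttempts` attempts in total. The final failure is rethrown.
    public static async Task<T> ExecuteAsync<T, TException>(
        Func<Task<T>> operation,
        Func<TException, DateTimeOffset> getResetAt,
        int maxAttempts = 3)
        where TException : Exception
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (TException ex) when (attempt < maxAttempts)
            {
                // Assumption: the reset time is an absolute UTC instant.
                var delay = getResetAt(ex) - DateTimeOffset.UtcNow;
                if (delay > TimeSpan.Zero)
                    await Task.Delay(delay);
            }
        }
    }
}
```

To use it with the SDK, pass FlareRateLimitException as TException, wrap your enqueue call in the operation delegate, and supply ex => ex.ResetAt as getResetAt (adjusting the conversion if ResetAt turns out to be a different type).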