
Workers API

The Workers API is the internal wire protocol between the Zeridion.Flare SDK and the server. It is SDK-internal — you do not call these endpoints directly when using the SDK. They are documented here for completeness, for operators building custom worker implementations, and for debugging.

Base URL: https://api.zeridion.com/flare/v1

All endpoints require Bearer token authentication via an API key.
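As a rough illustration of the base URL and Bearer authentication described above, the following sketch builds an authenticated JSON POST using only the Python standard library. The helper names `build_request` and `post_json` are hypothetical and not part of any Zeridion SDK. It assumes every endpoint path is appended to the base URL and that responses are JSON.

```python
import json
import urllib.request

BASE_URL = "https://api.zeridion.com/flare/v1"  # base URL from this reference


def build_request(path: str, body: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post_json(path: str, body: dict, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so callers
    can catch that to inspect statuses such as 400, 404, or 409.
    """
    request = build_request(path, body, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `post_json("/workers/poll", body, api_key)` would then cover any of the endpoints below.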

:::info SDK-internal
If you are using the Zeridion.Flare NuGet SDK, the FlareWorkerService background service calls these endpoints automatically. You only need this reference if you are building a custom worker in another language or runtime.
:::


Worker ID constraints

A worker_id uniquely identifies a single worker process. The SDK generates them automatically using the format:

wrk_{hostname}_{pid}_{random8}

For example: wrk_prod-host-01_12345_a1b2c3d4

Rules for custom worker IDs:

  • Must be a non-empty string.
  • Maximum 100 characters (DB column constraint).
  • Should be stable across reconnects within the same process lifetime; use a new ID when the process restarts.
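A custom worker could follow the same format. The sketch below is one possible approach, not the SDK's implementation: it assumes `random8` means eight hexadecimal characters (as in the example ID), and truncating an overlong hostname to stay within 100 characters is a choice made here for illustration.

```python
import os
import secrets
import socket

MAX_WORKER_ID_LENGTH = 100  # limit stated in the rules above


def generate_worker_id() -> str:
    """Build an ID shaped like wrk_{hostname}_{pid}_{random8}."""
    # Assumption: random8 is 8 hex characters, matching the documented example.
    tail = f"_{os.getpid()}_{secrets.token_hex(4)}"
    # Assumption: trim the hostname (not the suffix) if the ID would be too long.
    room = MAX_WORKER_ID_LENGTH - len("wrk_") - len(tail)
    return f"wrk_{socket.gethostname()[:room]}{tail}"


def validate_worker_id(worker_id: str) -> None:
    """Raise ValueError if the ID breaks the documented rules."""
    if not isinstance(worker_id, str) or not worker_id:
        raise ValueError("worker_id must be a non-empty string")
    if len(worker_id) > MAX_WORKER_ID_LENGTH:
        raise ValueError("worker_id must be at most 100 characters")
```

Generating the ID once at process start and reusing it for every call keeps it stable across reconnects, while a restart naturally yields a new one.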

POST /flare/v1/workers/register

Announce a worker to the server and declare which queues and job types it handles. Call this once at startup, before the first poll. If recurring_schedules are provided, the server upserts them as recurring job definitions.

Request

POST /flare/v1/workers/register
Authorization: Bearer <api_key>
Content-Type: application/json

Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| worker_id | string | Yes | Unique identifier for this worker process. |
| queues | array of strings | Yes | Queues this worker will poll. At least one entry required. |
| job_types | array of strings | Yes | Job type names this worker can execute. |
| hostname | string | No | Machine hostname, used for observability in the dashboard. |
| sdk_version | string | No | SDK version string, used for compatibility tracking. |
| recurring_schedules | array | No | Recurring job schedules to upsert on the server. See below. |

recurring_schedules item

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| job_type | string | Yes | Job type name, used as the recurring job ID prefix (rjob_{job_type}). |
| cron_expression | string | Yes | Cron expression. Accepts the standard 5-field format or the 6-field format with an optional seconds field (parsed by the Cronos library; see its format spec for the full grammar). |
| queue | string | No | Queue to enqueue into. Defaults to "default". |
| timezone | string | No | IANA timezone for cron evaluation. Defaults to UTC. |
| max_attempts | integer | No | Override max attempts for enqueued jobs. |
| timeout_seconds | integer | No | Override timeout for enqueued jobs. |

Example

{
  "worker_id": "wrk_prod-host-01_12345_a1b2c3d4",
  "queues": ["default", "email"],
  "job_types": ["email.send", "report.generate"],
  "hostname": "prod-host-01",
  "sdk_version": "0.1.0-beta.1",
  "recurring_schedules": [
    {
      "job_type": "report.generate",
      "cron_expression": "0 3 * * *",
      "queue": "default",
      "timezone": "America/New_York"
    }
  ]
}

Response

200 OK

{ "status": "registered" }

Errors

| Status | Code | Condition |
| --- | --- | --- |
| 400 | invalid_request | Validation failed (missing worker_id, empty queues, etc.) |
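To avoid a round trip that ends in `400 invalid_request`, a custom worker might validate the registration body locally before sending it. The sketch below is a minimal, hypothetical helper (`build_register_body` is not an SDK function). Its cron check only counts whitespace-separated fields (5 or 6) and does not reproduce the Cronos grammar, so the server remains the authority on validity.

```python
from typing import List, Optional


def build_register_body(
    worker_id: str,
    queues: List[str],
    job_types: List[str],
    hostname: Optional[str] = None,
    sdk_version: Optional[str] = None,
    recurring_schedules: Optional[List[dict]] = None,
) -> dict:
    """Assemble a /workers/register body, checking the documented basics."""
    if not worker_id:
        raise ValueError("worker_id is required")
    if not queues:
        raise ValueError("queues must contain at least one entry")
    for schedule in recurring_schedules or []:
        if not schedule.get("job_type"):
            raise ValueError("each recurring schedule needs a job_type")
        # Rough shape check only: 5 fields, or 6 when seconds are included.
        field_count = len(schedule.get("cron_expression", "").split())
        if field_count not in (5, 6):
            raise ValueError("cron_expression must have 5 or 6 fields")

    body = {
        "worker_id": worker_id,
        "queues": list(queues),
        "job_types": list(job_types),
    }
    # Optional fields are omitted entirely when not supplied.
    if hostname is not None:
        body["hostname"] = hostname
    if sdk_version is not None:
        body["sdk_version"] = sdk_version
    if recurring_schedules:
        body["recurring_schedules"] = list(recurring_schedules)
    return body
```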

POST /flare/v1/workers/poll

Dequeue up to capacity jobs that match the worker's queues and job types. Returns an empty jobs array (not 204) when no work is available — the SDK implements a backoff loop internally.

Request

POST /flare/v1/workers/poll
Authorization: Bearer <api_key>
Content-Type: application/json

Body

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| worker_id | string | Yes | | Identifies the polling worker. |
| queues | array of strings | Yes | | Queues to poll. At least one required. |
| capacity | integer | No | 1 | Maximum jobs to dequeue in a single call (1–50). |
| job_types | array of strings | No | | If provided, only jobs with a matching job_type are returned. |

Example

{
  "worker_id": "wrk_prod-host-01_12345_a1b2c3d4",
  "queues": ["default", "email"],
  "capacity": 5
}

Response

200 OK

{
  "jobs": [
    {
      "id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
      "job_type": "email.send",
      "payload": { "to": "user@example.com" },
      "attempt": 1,
      "max_attempts": 3,
      "timeout_seconds": 1800,
      "enqueued_at": "2026-03-18T15:30:00Z"
    }
  ]
}

| Field | Type | Description |
| --- | --- | --- |
| jobs | array | Dequeued jobs. Empty array when no work is available. |
| jobs[].id | string | Job identifier. Pass to /ack and /heartbeat. |
| jobs[].job_type | string | Job type name for dispatch. |
| jobs[].payload | object | The job's JSON payload. |
| jobs[].attempt | integer | Current attempt number, starting at 1 for the first execution. |
| jobs[].max_attempts | integer | Maximum attempts allowed. |
| jobs[].timeout_seconds | integer | Per-attempt timeout. Enforce this in your executor. |
| jobs[].enqueued_at | string (ISO 8601) | When the job was last enqueued or re-queued for retry. |
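The response leaves dispatch and timeout enforcement to the worker. One possible shape is sketched below, assuming handlers are plain callables keyed by `job_type` (the `handlers` registry and `execute_job` are hypothetical names). Note the limitation: Python cannot forcibly stop a thread, so on timeout the handler keeps running in the background. A worker that needs hard cancellation would probably run jobs in a subprocess instead.

```python
import concurrent.futures
from typing import Any, Callable, Dict

Handler = Callable[[dict], Any]


def execute_job(job: dict, handlers: Dict[str, Handler]) -> Any:
    """Dispatch a polled job by job_type and enforce its timeout_seconds.

    Raises LookupError when no handler is registered, and
    concurrent.futures.TimeoutError when the handler exceeds the timeout.
    """
    handler = handlers.get(job["job_type"])
    if handler is None:
        raise LookupError(f"no handler registered for {job['job_type']!r}")

    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(handler, job["payload"])
    try:
        # Waits at most timeout_seconds; handler exceptions are re-raised here.
        return future.result(timeout=job["timeout_seconds"])
    finally:
        # Do not block on a handler that is still running after a timeout.
        executor.shutdown(wait=False)
```

Either exception can then be reported to `/ack` as a failed attempt.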

Errors

| Status | Code | Condition |
| --- | --- | --- |
| 400 | invalid_request | Validation failed (missing fields, capacity out of range) |
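Because an idle poll returns an empty `jobs` array rather than blocking, a custom worker needs its own backoff. This reference does not specify the SDK's backoff parameters, so the exponential schedule below (`min_delay`, `max_delay`) is purely an assumption. `poll_once` is a hypothetical callable that performs one `/workers/poll` request and returns the decoded response.

```python
import time
from typing import Callable, Iterator, List, Optional


def build_poll_body(worker_id: str, queues: List[str], capacity: int = 1,
                    job_types: Optional[List[str]] = None) -> dict:
    """Assemble a /workers/poll body, checking the documented limits."""
    if not queues:
        raise ValueError("queues must contain at least one entry")
    if not 1 <= capacity <= 50:
        raise ValueError("capacity must be between 1 and 50")
    body = {"worker_id": worker_id, "queues": list(queues), "capacity": capacity}
    if job_types is not None:
        body["job_types"] = list(job_types)
    return body


def poll_jobs(poll_once: Callable[[], dict],
              sleep: Callable[[float], None] = time.sleep,
              min_delay: float = 0.5,
              max_delay: float = 10.0,
              max_polls: Optional[int] = None) -> Iterator[dict]:
    """Yield jobs as they arrive, backing off while the queues are empty."""
    delay = min_delay
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        jobs = poll_once().get("jobs", [])
        if jobs:
            delay = min_delay  # work arrived: reset the backoff
            yield from jobs
        else:
            sleep(delay)  # idle: wait, then double the delay up to the cap
            delay = min(delay * 2, max_delay)
```

Passing `sleep` and `max_polls` as parameters is only there to make the loop easy to test; a long-running worker would leave both at their defaults.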

POST /flare/v1/workers/ack

Report the outcome of a job execution. Call this after every job attempt, whether it succeeded or failed; the one exception is an attempt aborted because /heartbeat returned status "cancel", which must not be acknowledged. The server updates job state, records error details, schedules the next retry if applicable, and activates continuation jobs on success.

Request

POST /flare/v1/workers/ack
Authorization: Bearer <api_key>
Content-Type: application/json

Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| job_id | string | Yes | ID of the job being acknowledged. |
| worker_id | string | Yes | ID of the worker that executed the job. |
| status | string | Yes | Outcome: "succeeded" or "failed". |
| duration_ms | integer | No | Wall-clock execution time in milliseconds. |
| error | object | No | Required when status is "failed". See error detail below. |

Error detail object

| Field | Type | Description |
| --- | --- | --- |
| type | string | Exception type name. |
| message | string | Exception message. |
| stack_trace | string | Stack trace string. |

Success example

{
  "job_id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
  "worker_id": "wrk_prod-host-01_12345_a1b2c3d4",
  "status": "succeeded",
  "duration_ms": 1042
}

Failure example

{
  "job_id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
  "worker_id": "wrk_prod-host-01_12345_a1b2c3d4",
  "status": "failed",
  "duration_ms": 503,
  "error": {
    "type": "System.Net.Http.HttpRequestException",
    "message": "Connection refused",
    "stack_trace": " at System.Net.Http.HttpClient.SendAsync..."
  }
}
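Both request shapes can be produced from a single wrapper around the job handler. The sketch below is illustrative only: `run_and_build_ack` is a hypothetical helper, and mapping a Python exception's class name, message, and traceback onto `type`, `message`, and `stack_trace` is an assumption about what a non-.NET worker would reasonably send.

```python
import time
import traceback
from typing import Any, Callable


def run_and_build_ack(job: dict, worker_id: str,
                      handler: Callable[[dict], Any]) -> dict:
    """Run one attempt and return the matching /workers/ack request body."""
    started = time.monotonic()
    error = None
    try:
        handler(job["payload"])
    except Exception as exc:  # any handler failure becomes a "failed" ack
        error = {
            "type": type(exc).__name__,
            "message": str(exc),
            "stack_trace": traceback.format_exc(),
        }
    # Wall-clock duration of the attempt, in whole milliseconds.
    duration_ms = int((time.monotonic() - started) * 1000)

    body = {
        "job_id": job["id"],
        "worker_id": worker_id,
        "status": "failed" if error else "succeeded",
        "duration_ms": duration_ms,
    }
    if error:
        body["error"] = error  # required when status is "failed"
    return body
```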

Response

200 OK

{
  "action": "succeeded",
  "retry_at": null,
  "children_activated": 2
}

| Field | Type | Description |
| --- | --- | --- |
| action | string | Resolved outcome: "succeeded", "failed", "retry", or "dead_letter". |
| retry_at | string (ISO 8601) | When the next retry is scheduled (with backoff), or null if not retrying. |
| children_activated | integer | Number of continuation jobs moved to pending on success, or null. |
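The server owns retry scheduling, so a worker does not need to act on `retry_at`. For logging or metrics, though, it can be handy to turn the response into a delay. The helper below is a hypothetical sketch of that, assuming `retry_at` is a UTC timestamp with a trailing `Z` as in the other examples in this reference.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_retry(ack_response: dict,
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return seconds until the scheduled retry, or None if not retrying."""
    retry_at = ack_response.get("retry_at")
    if ack_response.get("action") != "retry" or retry_at is None:
        return None
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    when = datetime.fromisoformat(retry_at.replace("Z", "+00:00"))
    current = now if now is not None else datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())
```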

Errors

| Status | Code | Condition |
| --- | --- | --- |
| 400 | invalid_request | Validation failed or status is not a valid value. |
| 404 | job_not_found | Job does not exist or does not belong to this project. |
| 409 | invalid_state | Job is not in processing state. |

POST /flare/v1/workers/heartbeat

Report that a long-running job is still alive and optionally update its progress. The server resets the job's liveness deadline, which keeps the stuck-job reaper from retrying or dead-lettering the job while it is still running.

Call this endpoint at regular intervals (every 15–30 seconds is typical) for any job that may run longer than its timeout_seconds.

Liveness and the stuck-job reaper

The timestamp of the most recent heartbeat on a job is the liveness signal a background reaper uses to detect dead workers. A processing job is considered stuck when either:

  • A heartbeat was received and the gap between now and the last heartbeat exceeds approximately two-thirds of the job's timeout_seconds (subject to a minimum grace period); or
  • A heartbeat was never received and the job started more than timeout_seconds ago.
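The two conditions above can be expressed as a small predicate, which may help when reasoning about how often a custom worker has to heartbeat. This is a model of the documented rule, not the server's code: the text only says "approximately two-thirds" and does not give the minimum grace period, so `min_grace_seconds` and its default here are placeholder assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stuck(now: datetime,
             started_at: datetime,
             last_heartbeat_at: Optional[datetime],
             timeout_seconds: float,
             min_grace_seconds: float = 30.0) -> bool:
    """Approximate the reaper's stuck-job test for a processing job."""
    if last_heartbeat_at is None:
        # Never heartbeated: stuck once the attempt outlives its timeout.
        return (now - started_at).total_seconds() > timeout_seconds
    # Heartbeated at least once: stuck when the silence exceeds roughly
    # two-thirds of the timeout, but never less than the grace period.
    threshold = max(timeout_seconds * 2 / 3, min_grace_seconds)
    return (now - last_heartbeat_at).total_seconds() > threshold
```

Under these assumptions, a job with `timeout_seconds` of 1800 that has sent a heartbeat would need to send the next one within about 1200 seconds, so a 15 to 30 second interval leaves a wide margin.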

When the reaper trips, it either retries the job (if attempts remain) or moves it to dead_letter and triggers any configured job_dead_letter alerts. After that, any further heartbeat from the stale worker fails with 409 invalid_state, because the job is no longer in processing.

The separate status: "cancel" response shape (see below) is reserved for jobs that have been cancelled out-of-band, typically via POST /flare/v1/jobs/{id}/cancel. It signals that the worker MUST abort the in-flight attempt without calling /ack.

Request

POST /flare/v1/workers/heartbeat
Authorization: Bearer <api_key>
Content-Type: application/json

Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| job_id | string | Yes | ID of the in-flight job. |
| worker_id | string | Yes | ID of the worker executing the job. |
| progress | number | No | Progress value in [0.0, 1.0]. Values outside this range are silently ignored; in particular, reports above 1.0 are discarded rather than capped. The server stores MAX(existing, reported) so out-of-order frames are safe. |
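The progress rule can be illustrated with a few lines. `apply_progress` below models the server-side behaviour as described in the table (ignore out-of-range values, otherwise keep the maximum); it is not the server's code. `clamp_progress` is a hypothetical client-side helper, added because an unclamped value such as 1.0000001 would be discarded rather than stored.

```python
from typing import Optional


def apply_progress(existing: float, reported: Optional[float]) -> float:
    """Model of the documented rule: ignore out-of-range, else keep the max."""
    if reported is None or not (0.0 <= reported <= 1.0):
        return existing  # silently ignored, not capped
    return max(existing, reported)


def clamp_progress(done: int, total: int) -> float:
    """Client-side helper: keep the reported fraction inside [0.0, 1.0]."""
    if total <= 0:
        return 0.0
    return min(1.0, max(0.0, done / total))
```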

Example

{
  "job_id": "job_01HYX3K7M8N9P2Q4R5S6T7U8V9",
  "worker_id": "wrk_prod-host-01_12345_a1b2c3d4",
  "progress": 0.65
}

Response

200 OK

{ "status": "ok" }

| Field | Type | Description |
| --- | --- | --- |
| status | string | "ok": heartbeat accepted, keep running. "cancel": the job has been cancelled out-of-band (typically via POST /flare/v1/jobs/{id}/cancel) and the worker MUST abort the current attempt immediately and not call /ack. |

Errors

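| Status | Code | Condition |
| --- | --- | --- |
| 400 | invalid_request | Missing job_id or worker_id. |
| 404 | job_not_found | Job does not exist or belongs to another project. |
| 409 | invalid_state | Job is no longer in processing state (for example, after the stuck-job reaper has retried or dead-lettered it). |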
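Putting the heartbeat rules together, a custom worker might run a small loop next to each in-flight job. The sketch below is one way to structure it, under stated assumptions: `send_heartbeat` is a hypothetical callable that posts to `/workers/heartbeat` and returns the response's `status` string, and `JobNoLongerProcessing` is a made-up exception that the caller would raise when the server answers `409 invalid_state`. The 15-second default is taken from the typical interval mentioned earlier.

```python
import threading
from typing import Callable


class JobNoLongerProcessing(Exception):
    """Hypothetical: raised by send_heartbeat on a 409 invalid_state reply."""


def heartbeat_loop(send_heartbeat: Callable[[], str],
                   done: threading.Event,
                   abort: threading.Event,
                   interval_seconds: float = 15.0) -> None:
    """Heartbeat until the job finishes or the server tells us to stop.

    Sets `abort` when the attempt must be abandoned without calling /ack:
    either the server replied "cancel", or the job left the processing state.
    """
    # Event.wait returns True as soon as `done` is set, ending the loop.
    while not done.wait(interval_seconds):
        try:
            status = send_heartbeat()
        except JobNoLongerProcessing:
            abort.set()  # reaped or otherwise moved on: stop working on it
            return
        if status == "cancel":
            abort.set()  # cancelled out-of-band: abort, do not ack
            return
```

In practice this would run in a `threading.Thread` started just before the handler, with the handler checking `abort` at safe points and the worker setting `done` once the attempt finishes.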

See also