Status & SLO

Live uptime, target SLO, and how to be notified before your monitoring catches the same incident.


Live status page

status.zeridion.com — real-time status of every public surface.

The status page covers:

  • API: api.zeridion.com/flare/v1/* and api.zeridion.com/platform/v1/*
  • Dashboard: dashboard.zeridion.com
  • Marketing & docs: zeridion.com, docs.zeridion.com
  • Webhook delivery: outbound deliveries to customer endpoints (aggregate, not per-customer)
  • Job processing: worker poll / ack / heartbeat success rate

Each component shows: current status (operational / degraded / partial outage / major outage), an uptime history for the last 90 days, and a feed of past incidents with post-mortems.


SLO target

The SLO we commit to internally (a tighter bar than the contractual SLA):

Surface | SLO target
API (api.zeridion.com) | 99.95% monthly uptime, p99 latency < 250ms for read endpoints, p99 < 750ms for write endpoints
Job processing pipeline | 99.95%: every accepted job reaches a terminal state (succeeded / failed / dead_letter / cancelled); no jobs lost
Webhook delivery | 99.5% first-attempt success when receiver is healthy (after 5 retries with exponential backoff)
Dashboard | 99.9%: covered by SLA on paid plans; best-effort on Free
Marketing & docs | 99.5% best-effort, no SLA

The contractual SLA commits to 99.9% monthly uptime for API + job processing on paid plans. The internal SLO is tighter so we have headroom to absorb a single incident inside any given month without breaching contract.
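To make the headroom concrete, the two uptime targets can be converted into a monthly downtime budget. The sketch below is illustrative arithmetic only: it assumes a 30-day month and that uptime is measured in wall-clock minutes, neither of which is stated on this page, and the function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(target_pct: float, days_in_month: int = 30) -> float:
    """Minutes of downtime a month can absorb while still meeting target_pct.

    Assumption: a 30-day month by default, measured in wall-clock minutes.
    """
    total_minutes = days_in_month * MINUTES_PER_DAY
    return total_minutes * (100.0 - target_pct) / 100.0


slo_budget = downtime_budget_minutes(99.95)  # internal SLO target
sla_budget = downtime_budget_minutes(99.9)   # contractual SLA target
headroom = sla_budget - slo_budget           # gap between the two budgets

print(f"SLO budget: {slo_budget:.1f} min")
print(f"SLA budget: {sla_budget:.1f} min")
print(f"Headroom:   {headroom:.1f} min")
```

Under these assumptions the internal SLO allows roughly 21.6 minutes of downtime per month and the contractual SLA roughly 43.2 minutes, so an incident that uses up the entire SLO budget still leaves about the same amount again before the SLA is breached.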


Subscribe to incidents

Three ways to be notified, in order of immediacy:

  1. Email: subscribe at status.zeridion.com (click "Subscribe"). One email per status change.
  2. RSS / Atom: status.zeridion.com/history.rss and status.zeridion.com/history.atom. Polls cleanly into Slack, Discord, or PagerDuty.
  3. Webhook: Enterprise customers can register a Slack / Microsoft Teams / PagerDuty webhook on the status page to be paged on P1 incidents that affect their plan tier or region.

For component-specific subscriptions (e.g. "only page me when the API in EU goes degraded"), use the per-component subscribe flow on the status page.
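If you would rather poll the RSS feed from your own tooling, a minimal sketch using only the Python standard library might look like the following. It assumes the feed is standard RSS 2.0 (items carrying `title`, `link`, `guid`, and `pubDate`) and that it is served over HTTPS; the exact feed schema is not documented here, so treat the field names and helper functions as assumptions.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: the feed path from this page, served over https.
FEED_URL = "https://status.zeridion.com/history.rss"


def fetch_feed(url=FEED_URL, timeout=10.0):
    """Download the raw feed bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def parse_incidents(rss_xml):
    """Extract one dict per <item>, assuming standard RSS 2.0 element names."""
    root = ET.fromstring(rss_xml)
    incidents = []
    for item in root.iter("item"):
        incidents.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "guid": item.findtext("guid", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return incidents


def unseen(incidents, seen_ids):
    """Return incidents not yet in seen_ids, and record them as seen."""
    fresh = []
    for incident in incidents:
        key = incident["guid"] or incident["link"]  # fall back to link if no guid
        if key not in seen_ids:
            seen_ids.add(key)
            fresh.append(incident)
    return fresh
```

A poller would call `unseen(parse_incidents(fetch_feed()), seen_ids)` on an interval and forward whatever comes back to its own alerting channel. Persisting `seen_ids` between runs avoids re-alerting after a restart.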


Recent incident history

Live history at status.zeridion.com/history. Per-incident post-mortems are published within 5 business days for P1 / P2 outages.

Public preview disclaimer: Zeridion Flare is in public preview. The platform has not yet experienced a customer-impacting incident; the status page will publish the first post-mortem within 5 business days of any future P1 / P2.


Maintenance windows

Planned maintenance is announced on the status page at least 72 hours in advance and scheduled outside business hours in your account's primary region. Maintenance windows are excluded from SLA uptime calculations. Subscribed channels (email / RSS / webhook) are notified at announcement, T-24 hours, T-1 hour, and at start / end.
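As an illustration of the notification cadence described above, the sketch below derives the five notification times for a planned window and rejects a window announced with less than 72 hours of notice. The function and constant names are hypothetical, and this is not a description of how the status page itself schedules notifications.

```python
from datetime import datetime, timedelta, timezone

MIN_NOTICE = timedelta(hours=72)  # minimum notice for planned maintenance


def notification_times(announced_at, start, end):
    """Return (label, time) pairs for a planned maintenance window.

    Raises ValueError if the window is malformed or announced with
    less than the 72-hour notice required for planned maintenance.
    """
    if end <= start:
        raise ValueError("maintenance window must end after it starts")
    if start - announced_at < MIN_NOTICE:
        raise ValueError("planned maintenance needs at least 72 hours of notice")
    return [
        ("announcement", announced_at),
        ("T-24 hours", start - timedelta(hours=24)),
        ("T-1 hour", start - timedelta(hours=1)),
        ("start", start),
        ("end", end),
    ]


# Example values only: a two-hour window announced four days ahead.
announced = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
window_start = datetime(2025, 1, 5, 2, 0, tzinfo=timezone.utc)
window_end = datetime(2025, 1, 5, 4, 0, tzinfo=timezone.utc)

for label, when in notification_times(announced, window_start, window_end):
    print(f"{label:>12}: {when.isoformat()}")
```

This can be handy for sanity-checking that your on-call tooling expects five messages per window rather than one.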

Emergency maintenance (security patch, infrastructure incident response) may bypass the 72-hour notice. We will still notify subscribed channels and document the incident on the status page within 24 hours.


SLA credit requests

Hit a multi-hour outage that crossed the SLA threshold? See the SLA page § Credit Request Process for the request form. The status page incident ID is the only identifier needed.


See also

  • SLA — contractual uptime commitment, credit schedule, dispute process
  • Support & SLA — response-time SLA by plan, security and abuse contacts
  • Limits & Quotas — what counts against each enforced cap
  • Monitoring guide — building your own dashboards on top of the metrics API