Uptime SLA

Every server has an uptime percentage over four rolling windows: 24 hours, 7 days, 30 days, and 90 days. The numbers live in the server detail page's top bar.

This page explains exactly what the percentage measures, what counts as "down," and how maintenance windows fit in.

What it measures

The uptime number is a measure of agent reachability. It answers: "what fraction of the window did this server's agent successfully report?" It is not a hardware or kernel SLA. A server that's powered on but can't reach the API counts as down. A server with a busted kernel that still manages to fire off the agent's cron job counts as up.

How it's calculated

For each window:

The eligible_minutes start at the size of the window (e.g. 1440 for 24h) or the server's age, whichever is smaller.
Any active maintenance windows in the period are subtracted from eligible_minutes.
The API walks the timestamps of received heartbeats in the window. Any gap longer than 15 minutes between consecutive heartbeats counts as downtime, less a 5-minute tolerance: a 16-minute gap counts as 11 minutes of downtime.
The final gap (from the last heartbeat to "now") is treated the same way.
uptime% = (eligible_minutes - downtime_minutes) / eligible_minutes × 100.
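
The steps above can be sketched in a few lines of Python. This is an illustrative reading of the rules, not the real calculateUptime(): the function name, its parameters, and the handling of cases the text doesn't spell out (no heartbeats at all, the gap before the first heartbeat, downtime overlapping maintenance) are assumptions.

```python
from datetime import datetime, timedelta

GAP_THRESHOLD_MIN = 15  # gaps longer than this count as downtime (from the text)
TOLERANCE_MIN = 5       # subtracted from every counted gap (from the text)


def sketch_uptime_percent(heartbeats, now, window_minutes,
                          server_age_minutes, maintenance_minutes=0.0):
    """Hypothetical re-implementation of the documented steps."""
    # Steps 1-2: window size or server age, whichever is smaller,
    # minus time spent in maintenance.
    eligible = min(window_minutes, server_age_minutes) - maintenance_minutes
    if eligible <= 0:
        return None  # assumption: nothing left to measure

    window_start = now - timedelta(minutes=window_minutes)
    points = sorted(t for t in heartbeats if window_start <= t <= now)
    if not points:
        return 0.0  # assumption: no heartbeats means the whole period is down

    # Step 4: the final gap (last heartbeat to "now") follows the same rule.
    # The gap before the first heartbeat isn't described, so it is ignored.
    points.append(now)

    # Step 3: walk consecutive heartbeats and add up the long gaps.
    downtime = 0.0
    for prev, cur in zip(points, points[1:]):
        gap = (cur - prev).total_seconds() / 60
        if gap > GAP_THRESHOLD_MIN:
            downtime += gap - TOLERANCE_MIN

    downtime = min(downtime, eligible)  # assumption: never report below 0%
    # Step 5: percentage, rounded to one decimal place as shown in the UI.
    return round(100 * (eligible - downtime) / eligible, 1)


now = datetime(2024, 1, 2)
beats = [now - timedelta(minutes=m) for m in (60, 44, 30, 15, 0)]
print(sketch_uptime_percent(beats, now, 1440, 1440))  # 99.2 (one 16-minute gap)
```

Only the 16-minute gap in the example exceeds the threshold, so 11 of 1440 eligible minutes count as downtime.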

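The 15-minute floor is deliberate. With a 5-minute push interval, one missed heartbeat produces a 10-minute gap and two produce a 15-minute gap, neither of which exceeds the threshold. Assuming heartbeats otherwise arrive on schedule, it takes three consecutive misses (a 20-minute gap) to move the number.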
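
To make the effect of the floor concrete, here is a small sketch that maps a number of consecutive missed heartbeats to the downtime the rules would attribute. It assumes heartbeats otherwise arrive exactly every 5 minutes with no jitter; downtime_for_missed is a hypothetical helper name.

```python
PUSH_INTERVAL_MIN = 5   # agent push interval (from the text)
GAP_THRESHOLD_MIN = 15  # only gaps longer than this count (from the text)
TOLERANCE_MIN = 5       # subtracted from every counted gap (from the text)


def downtime_for_missed(missed):
    """Downtime minutes attributed to `missed` consecutive missed heartbeats."""
    # Missing k heartbeats leaves k + 1 push intervals between two received ones.
    gap = (missed + 1) * PUSH_INTERVAL_MIN
    return gap - TOLERANCE_MIN if gap > GAP_THRESHOLD_MIN else 0


for missed in range(1, 5):
    print(missed, downtime_for_missed(missed))
# 1 0
# 2 0
# 3 15
# 4 20
```

Note the jump from 0 to 15 minutes at the third miss: once a gap crosses the threshold, the whole gap minus the tolerance is counted, not just the part beyond 15 minutes.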

Where you see it

On the server detail page, top of the screen:

24h: 100.0%   7d: 99.8%   30d: 99.9%   90d: 99.6%

Each is rounded to one decimal place. The 7-day figure is also surfaced on the main server list as a small badge.

The API returns the same numbers on GET /servers/:id:

{
  "server": {
    "uptime": {
      "h24": 100.0,
      "d7":  99.8,
      "d30": 99.9,
      "d90": 99.6
    }
  }
}
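
A minimal sketch of reading that response in a script. The field names come from the sample above; extract_uptime and windows_below are hypothetical helpers, and the HTTP call is left out because the base URL and authentication aren't covered on this page.

```python
import json

WINDOW_KEYS = ("h24", "d7", "d30", "d90")  # keys shown in the sample response


def extract_uptime(payload):
    """Return the four uptime percentages from a /servers/:id response body."""
    body = json.loads(payload)
    uptime = body["server"]["uptime"]
    return {key: float(uptime[key]) for key in WINDOW_KEYS}


def windows_below(uptime, threshold):
    """List the windows whose percentage is under `threshold`."""
    return [key for key in WINDOW_KEYS if uptime[key] < threshold]


sample = '{"server": {"uptime": {"h24": 100.0, "d7": 99.8, "d30": 99.9, "d90": 99.6}}}'
uptime = extract_uptime(sample)
print(windows_below(uptime, 99.9))  # ['d7', 'd90']
```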

Maintenance and SLA

Time spent inside an active maintenance window is removed from eligible_minutes. A planned 2-hour reboot in a 24-hour window shrinks the divisor from 1440 to 1320 minutes, so the maintenance doesn't drag your number down, but it also doesn't inflate it artificially. See Maintenance windows for the rules.
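
The divisor arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the service's code: maintenance windows are given as (start, end) minute offsets from the start of the measured period, they don't overlap each other, and any part falling outside the period is clipped. The function name is hypothetical.

```python
def eligible_minutes(window_minutes, server_age_minutes, maintenance=()):
    """Window size or server age, whichever is smaller, minus maintenance time."""
    period = min(window_minutes, server_age_minutes)
    excluded = 0
    for start, end in maintenance:
        # Count only the part of the maintenance window inside the period.
        clipped_start = max(start, 0)
        clipped_end = min(end, period)
        if clipped_end > clipped_start:
            excluded += clipped_end - clipped_start
    return period - excluded


# A 2-hour reboot (minutes 600-720) inside a 24-hour window.
eligible = eligible_minutes(1440, 10_000, [(600, 720)])
print(eligible)  # 1320

# 33 minutes of downtime outside the maintenance window.
print(round(100 * (eligible - 33) / eligible, 1))  # 97.5
```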

Uptime SLA vs. synthetic uptime checks

There are two "uptime" concepts in BoxWatch, and they answer different questions:

Uptime SLA (this page) — is the box alive enough to run cron and reach our API? Host-level. Always on, no configuration.
Synthetic uptime checks — is the application listening, returning the right status, with valid TLS? Service-level. Configured per-check, runs from one or more of your servers. See Synthetic uptime checks.

Use both. A box can have 100% SLA and still serve 500s.

Notes and limits

Servers younger than the window report uptime based on their actual lifespan. A 2-day-old server's 30-day percentage is computed against its 48 hours of life, not 30 days of zeros.
The 15-minute downtime threshold is fixed in calculateUptime() and isn't per-server tunable in v1.
Numbers refresh on each request to /servers/:id — there's no background cache to invalidate.
Exporting uptime history as CSV isn't built into the dashboard in v1. Use the GET /servers/:id API and process the returned numbers in your own script.
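
One way to build your own history is to sample GET /servers/:id on a schedule and append each result to a CSV file. The sketch below covers only the CSV side and takes an already-decoded response; the column layout, the helper names, and the "srv-1" identifier are all assumptions chosen for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column layout; the last four names match the API's uptime keys.
FIELDS = ["sampled_at", "server_id", "h24", "d7", "d30", "d90"]


def uptime_row(server_id, payload, sampled_at):
    """Flatten one decoded /servers/:id response into a CSV row."""
    uptime = payload["server"]["uptime"]
    row = {"sampled_at": sampled_at.isoformat(), "server_id": server_id}
    row.update({key: uptime[key] for key in FIELDS[2:]})
    return row


def write_rows(stream, rows, write_header=True):
    """Write rows to any text stream, such as a file opened with newline=''."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerows(rows)


payload = {"server": {"uptime": {"h24": 100.0, "d7": 99.8, "d30": 99.9, "d90": 99.6}}}
buffer = io.StringIO()
write_rows(buffer, [uptime_row("srv-1", payload, datetime(2024, 1, 1, tzinfo=timezone.utc))])
print(buffer.getvalue())
```

Because the numbers are computed on each request, every row is a snapshot of the four rolling windows at the moment it was sampled.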