title: "Multi-region URL monitoring"
description: "Probe the same public endpoint from servers in different regions to catch regional outages without a separate probe network."
last_updated: "2026-05-24"
Multi-region URL monitoring
Pro+
You run a public API at https://api.yourapp.com. You want to know not just "is it up?" but "is it up from where my users actually are?" Geographic vantage matters: an outage in us-east-1 looks fine from your eu-west-1 monitoring box, and vice versa.
This recipe sets up one HTTP uptime check assigned to one server per region. BoxWatch aggregates the probe results; you get a single check that reflects regional reality.
What you'll end up with
- One HTTP check called `api.yourapp.com health`.
- Three probe servers, one each in US-East, US-West, and EU.
- Aggregated status that goes `down` only when a majority of regions can't reach you.
- Per-region latency charts as a free side effect.
Why agents in your regions, not a SaaS probe network
The honest answer: because you already have servers there. If your fleet spans us-east-1, us-west-2, and eu-west-1, the geographic diversity of your monitoring topology comes for free. No per-region billing, no extra vendor.
The trade-off is that "probe regions" are wherever your boxes happen to live. If you want a probe in Tokyo and you don't have a server in Tokyo, you can't have one. The advice is to spin up one cheap VPS per region you care about and install the agent there — it's still cheaper than a per-check billing tier on a SaaS probe service.
Prerequisites
- BoxWatch Pro plan or higher.
- One BoxWatch agent installed per region you want to probe from. We'll assume three: `us-east-monitor`, `us-west-monitor`, `eu-monitor`. Install instructions at Installing the agent.
- The public URL you want to probe. We'll use `https://api.yourapp.com/health`, assumed to return `200 OK` with a body containing `"OK"`.
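Before creating the check, it can help to confirm by hand that the endpoint behaves as assumed. The sketch below mirrors the three conditions this recipe uses (status 200, body containing `OK`, response within 2000 ms). The function names `probe_passes` and `probe_url` are hypothetical helpers for illustration, not part of BoxWatch, and the thresholds are hard-coded to this recipe's example values.

```shell
# probe_passes STATUS LATENCY_MS BODY
# Returns 0 only if all three of this recipe's example conditions hold.
probe_passes() {
  local status=$1 latency_ms=$2 body=$3
  [ "$status" -eq 200 ] || return 1        # expected status code
  [ "$latency_ms" -le 2000 ] || return 1   # max latency in ms
  case $body in
    *OK*) return 0 ;;                      # case-sensitive substring match
    *)    return 1 ;;
  esac
}

# probe_url URL
# Runs one request with curl (following redirects, 10 s timeout) and
# applies probe_passes to the result. Defined here but not called.
probe_url() {
  local url=$1 body_file out status seconds latency_ms rc=0
  body_file=$(mktemp)
  out=$(curl -sS -L --max-time 10 -o "$body_file" \
        -w '%{http_code} %{time_total}' "$url") || { rm -f "$body_file"; return 1; }
  status=${out% *}      # text before the space: HTTP status code
  seconds=${out#* }     # text after the space: total time in seconds
  latency_ms=$(awk -v s="$seconds" 'BEGIN { printf "%d", s * 1000 }')
  probe_passes "$status" "$latency_ms" "$(cat "$body_file")" || rc=$?
  rm -f "$body_file"
  return "$rc"
}
```

Running `probe_url https://api.yourapp.com/health && echo pass || echo fail` from each probe server gives a rough preview of what that region will report.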
Step 1: add an HTTP uptime check
Dashboard
- Dashboard → Uptime → New check.
- Name: `api.yourapp.com health`.
- Check type: HTTP.
- Target URL: `https://api.yourapp.com/health`.
- Expected status codes: `200` (or `200-299` if your health endpoint sometimes returns 204).
- Body contains (optional): `OK`. Catches the case where your health endpoint returns 200 but the body says "DEGRADED."
- Max latency (optional): `2000` ms. Triggers a fail if the response takes longer than 2 seconds.
- Follow redirects: usually on.
- Timeout: 10 seconds.
- Probe servers: pick `us-east-monitor`, `us-west-monitor`, `eu-monitor`.
- Save.
API
```shell
# Your BoxWatch API token.
export TOKEN="bw_..."

# Create one HTTP check probed from three servers.
# The probe_server_ids below are example IDs; use your own agents' IDs.
curl -fsS -X POST https://api.boxwatch.app/uptime-checks \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "api.yourapp.com health",
    "check_type": "http",
    "target": "https://api.yourapp.com/health",
    "expected_status_codes": "200",
    "max_latency_ms": 2000,
    "body_contains": "OK",
    "follow_redirects": 1,
    "timeout_seconds": 10,
    "probe_server_ids": [11, 22, 33],
    "alert_on_down": 1,
    "alert_on_recovery": 1
  }'
```

Field notes:
- `target` is the full URL (with scheme), not separate host/port fields. Validator requires `http://` or `https://`.
- `expected_status_codes` accepts a single code (`200`), a comma list (`200,204,301`), or a range (`200-299`).
- `body_contains` is a substring match, not a regex. Up to 500 characters. Case-sensitive.
- `probe_server_ids` is the list of agents that will run the probe, typically one per region.
See the full Uptime Checks API reference.
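To make the three `expected_status_codes` forms concrete, here is a minimal sketch of how such a spec could be matched against a response code. It illustrates the semantics described in the field notes and is not BoxWatch's actual validator. The function name `status_matches` is hypothetical, and whether the real API allows ranges inside a comma list is an assumption.

```shell
# status_matches SPEC CODE
# SPEC is a single code ("200"), a comma list ("200,204,301"),
# or a range ("200-299"). Returns 0 if CODE satisfies SPEC.
status_matches() {
  local spec=$1 code=$2 part lo hi
  local IFS=,                      # split SPEC on commas
  for part in $spec; do
    case $part in
      *-*)                         # range form: lo-hi, inclusive
        lo=${part%-*}
        hi=${part#*-}
        if [ "$code" -ge "$lo" ] && [ "$code" -le "$hi" ]; then
          return 0
        fi
        ;;
      *)                           # single code
        if [ "$code" -eq "$part" ]; then
          return 0
        fi
        ;;
    esac
  done
  return 1
}
```

For example, `status_matches "200-299" 204` succeeds, while `status_matches "200" 204` fails, which is why the range form is suggested for health endpoints that sometimes return 204.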
Step 2: what aggregation looks like in practice
You have three regions. Each one probes the URL on its heartbeat (every minute on Pro, Team, and Scale). BoxWatch aggregates the three results:
- 3 of 3 OK → `up`.
- 1 of 3 down → `degraded`. Visible on the dashboard. No alert.
- 2 of 3 down → `down` (strict majority). After two consecutive `down` aggregations, alert fires.
- 3 of 3 down → `down`, alert after second consecutive sample.
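The rules above can be sketched as a small function. This illustrates the strict-majority logic as described here, not BoxWatch's implementation. The names `aggregate` and `should_alert` are hypothetical.

```shell
# aggregate STATUS...
# Each STATUS is "up" or "down" (one per probe server).
# Prints the aggregated state: up, degraded, or down.
aggregate() {
  local total=$# down=0 s
  for s in "$@"; do
    if [ "$s" = "down" ]; then
      down=$((down + 1))
    fi
  done
  if [ "$down" -eq 0 ]; then
    echo up
  elif [ $((down * 2)) -gt "$total" ]; then
    echo down          # strict majority of probes failing
  else
    echo degraded      # some failing, but not a strict majority
  fi
}

# should_alert PREVIOUS CURRENT
# Returns 0 when two consecutive aggregations are both "down".
should_alert() {
  [ "$1" = "down" ] && [ "$2" = "down" ]
}
```

With three probes, `aggregate up up down` prints `degraded` and `aggregate down up down` prints `down`. The alert condition holds only once the second `down` sample arrives.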
The case that makes regional monitoring interesting is "1 of 3 down." That's where you see a problem isolated to one region. The dashboard shows yellow and tells you exactly which region — the per-probe table has one row per probe server with its last result. That data is often what you actually want during an incident, even when the global aggregate hasn't tipped over.
With only two vantages, "1 of 2 down" is exactly half, not a strict majority, so the check stays degraded rather than going down. Practical upshot: with two probe servers, an alert fires only when both fail, so an outage that one probe misses never pages you. Two regions is enough to detect a regional issue (you'll see it in the dashboard). With three, two failing probes form a strict majority, so the check can go down and alert while one region still looks healthy; a single failing region still shows as degraded without alerting. Three is the practical minimum.
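The arithmetic behind this is short: under a strict-majority rule, the smallest number of failing probes that tips the aggregate to `down` is `n / 2 + 1` using integer division. A quick sketch follows; the helper name `min_down_for_alert` is hypothetical.

```shell
# min_down_for_alert N
# Smallest number of failing probes that is a strict majority of N.
min_down_for_alert() {
  echo $(( $1 / 2 + 1 ))
}

# Show the threshold for small fleets.
for n in 1 2 3 4 5; do
  echo "$n probes: down when $(min_down_for_alert "$n") fail"
done
```

Two probes need both to fail, while three probes need only two. The check can therefore go `down` with one region still healthy only from three probes upward.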
Step 3: route alerts
Email is on by default. Add Slack at Dashboard → Account → Notifications with your incoming-webhook URL. See Slack alerts.
When the check goes down, the alert message names the check, the time, and the failing probes — so the on-call engineer can immediately see "EU and US-West are failing, US-East is fine" and start triaging accordingly.
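Before pasting a Slack incoming-webhook URL into the dashboard, you may want to confirm the URL itself works. Slack incoming webhooks accept a JSON body with a `text` field. The sketch below builds that body and defines a sender. Both function names are hypothetical, and the message text is only a placeholder, not the format BoxWatch uses for its alerts.

```shell
# slack_payload TEXT
# Prints a minimal incoming-webhook JSON body. No JSON escaping is done,
# so keep TEXT free of double quotes and backslashes.
slack_payload() {
  printf '{"text":"%s"}' "$1"
}

# send_slack_test WEBHOOK_URL
# Posts a test message to the given webhook. Defined here but not called.
send_slack_test() {
  curl -fsS -X POST \
    -H 'Content-Type: application/json' \
    -d "$(slack_payload 'BoxWatch webhook test')" \
    "$1"
}
```

Calling `send_slack_test "$SLACK_WEBHOOK_URL"` (with your own URL in that variable) should produce a message in the target channel. If it does not, fix the webhook before relying on it for alerts.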
What you'll see in the dashboard
- A combined latency chart with one line per probe server. Regional latency variance becomes visible at a glance — if your EU box is suddenly showing 800ms when it usually shows 80ms, that's a signal even when nothing is technically down.
- Per-probe status pills: each region shows `up` / `down` / `pending` with a timestamp.
- Recent probe results: status code, latency, error kind for each attempt.
Layer this with other signals
Multi-region URL monitoring is most honest when you treat it as one input among several:
- Cloud-provider region status pages tell you about provider outages directly. BoxWatch tells you whether your specific app is reachable through them.
- Server-side process monitoring on your API hosts tells you the API process is alive. See Monitor nginx across multiple hosts for that pattern.
- Synthetic uptime catches things process monitoring misses — bad config, broken upstream, 500s under load.
Together, they answer different questions. Use them all.
Common gotchas
- Probes from inside your own VPC. If your probe server is in the same VPC as the API, you're measuring intra-VPC reachability, not user reachability. Put at least one probe server somewhere outside your private network.
- Health endpoint that lies. A `/health` endpoint that returns 200 regardless of dependency state is useless here. The `body_contains` check catches the simplest version of this (`"OK"` vs `"DEGRADED"`), but a well-designed health endpoint that actually exercises dependencies is worth its weight in gold.
- Aggressive `max_latency_ms`. A 200 ms threshold sounds reasonable until your EU-to-US-East probe starts hitting 250 ms during evening traffic and you wake up at 2 AM for a non-incident. Pick a number above your worst legitimate p95.
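To put a number on "worst legitimate p95", one rough approach is to collect observed latencies in milliseconds for each probe region, one value per line, and compute the 95th percentile with the nearest-rank method. The helper `p95` below is a hypothetical sketch. How you gather the samples is up to you.

```shell
# p95
# Reads one latency value (ms) per line on stdin and prints the
# 95th percentile using the nearest-rank method: rank = ceil(0.95 * N).
p95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      r = int((NR * 95 + 99) / 100)   # integer form of ceil(0.95 * NR)
      print v[r]
    }'
}
```

Run it once per region, for example `p95 < eu-latencies.txt` (a hypothetical file name), then set `max_latency_ms` comfortably above the largest result.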