diff --git a/CHANGELOG.md b/CHANGELOG.md index e59d9f9f..24ff786e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,7 @@ All notable changes to this project will be documented in this file. - Added /api/v1/checks/uuid/flips/ endpoint (#349) - In the cron expression dialog, show a human-friendly version of the expression - Indicate a started check with a progress spinner under status icon (#338) +- Added "Docs > Reliability Tips" page ### Bug Fixes - Removing Pager Team integration, project appears to be discontinued diff --git a/templates/docs/api.html b/templates/docs/api.html index 3c3d2554..2c798116 100644 --- a/templates/docs/api.html +++ b/templates/docs/api.html @@ -1,5 +1,5 @@ -
SITE_NAME REST API supports listing, creating, updating, pausing and deleting +
SITE_NAME Management API supports listing, creating, updating, pausing and deleting checks in user's account.
Your requests to SITE_NAME REST API must authenticate using an +
Your requests to SITE_NAME Management API must authenticate using an API key. Each project in your SITE_NAME account has separate API keys. There are no account-wide API keys. By default, a project on SITE_NAME doesn't have an API key. You can create read-write and read-only API keys in the diff --git a/templates/docs/api.md b/templates/docs/api.md index 631c8bec..16a55892 100644 --- a/templates/docs/api.md +++ b/templates/docs/api.md @@ -1,6 +1,6 @@ -# API Reference +# Management API -SITE_NAME REST API supports listing, creating, updating, pausing and deleting +SITE_NAME Management API supports listing, creating, updating, pausing and deleting checks in user's account. ## API Endpoints @@ -20,7 +20,7 @@ Endpoint Name | Endpoint Address ## Authentication -Your requests to SITE_NAME REST API must authenticate using an +Your requests to SITE_NAME Management API must authenticate using an API key. Each project in your SITE_NAME account has separate API keys. There are no account-wide API keys. By default, a project on SITE_NAME doesn't have an API key. You can create read-write and read-only API keys in the diff --git a/templates/docs/configuring_notifications.html b/templates/docs/configuring_notifications.html index cd8786a8..ea832219 100644 --- a/templates/docs/configuring_notifications.html +++ b/templates/docs/configuring_notifications.html @@ -16,7 +16,8 @@ set up in each project separately.
In the web interface, the list of checks shows a visual overview of which alerting methods are enabled for each check. You can click the icons to toggle them on and off:
-You can also toggle the integrations on and off when viewing an individual check:
+You can also toggle the integrations on and off when viewing an individual check by +clicking on the "ON" / "OFF" labels:
SITE_NAME limits the maximum number of SMS and WhatsApp notifications an account diff --git a/templates/docs/configuring_notifications.md b/templates/docs/configuring_notifications.md index c73ad488..5133d0e4 100644 --- a/templates/docs/configuring_notifications.md +++ b/templates/docs/configuring_notifications.md @@ -20,7 +20,8 @@ methods are enabled for each check. You can click the icons to toggle them on an ![Integration icons in the checks list](IMG_URL/checks_integrations.png) -You can also toggle the integrations on and off when viewing an individual check: +You can also toggle the integrations on and off when viewing an individual check by +clicking on the "ON" / "OFF" labels: ![Integration on/off toggles in the check details page](IMG_URL/details_integrations.png) diff --git a/templates/docs/http_api.html b/templates/docs/http_api.html index 59550d10..abb1c8a8 100644 --- a/templates/docs/http_api.html +++ b/templates/docs/http_api.html @@ -1,6 +1,6 @@ -
The SITE_NAME pinging API is used for submitting success, failure and job start -signals from the monitored systems.
+The SITE_NAME pinging API is used for submitting "start", "success" and "fail" +signals ("pings") from the monitored systems.
All ping endpoints support:
Successful responses will have the "200 OK" HTTP response status code and a short and simple string "OK" in the response body.
-HEAD|GET|POST PING_ENDPOINT{uuid}
HEAD|GET|POST PING_ENDPOINT{uuid}/fail
HEAD|GET|POST PING_ENDPOINT{uuid}/start
Sending monitoring signals over the public internet is inherently unreliable. +HTTP requests can sometimes take excessively long or fail completely +for a variety of reasons. Here are some general tips to make your monitoring +code more robust.
+Put a time limit on how long each ping is allowed to take. This is especially +important when sending a "start" signal at the start of a job: you don't want +a stuck ping to prevent the actual job from running. Another case is a continuously +running worker process which pings SITE_NAME after each completed item. A stuck +request would block the whole process, so it is important to guard against it.
+Specifying the timeout depends on the tool you use. curl, for example, has the
+--max-time
(shorthand: -m
) parameter:
# Send an HTTP request with a 10 second timeout:
+curl -m 10 PING_URL
+
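The same idea can be sketched in Python with the standard library. This is a minimal illustration, not part of the original page: `send_ping` is a hypothetical helper name, and the `opener` parameter is an assumption added only so the function can be exercised without network access. Note that the `timeout` passed to `urllib.request.urlopen` limits individual blocking socket operations, so it is not an exact equivalent of curl's `--max-time`, which caps the whole transfer.

```python
import urllib.request


def send_ping(url, timeout=10.0, opener=urllib.request.urlopen):
    """Send a GET request to `url`, giving up if the server stalls.

    `timeout` is in seconds and is handed to the opener unchanged.
    `opener` is a hypothetical injection point used for testing.
    """
    with opener(url, timeout=timeout) as response:
        return response.status
```

Calling `send_ping(PING_URL)` at the start of a job would then raise an exception instead of hanging indefinitely if the connection stalls.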
To minimize the number of false alerts you get from SITE_NAME, instruct your HTTP +client to retry failed requests several times.
+Specifying the retry policy depends on the tool you use. curl, for example, has the
+--retry
parameter:
# Retry up to 5 times, with an increasing delay between retries (1s, 2s, 4s, 8s, ...)
+curl --retry 5 PING_URL
+
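For clients without a built-in retry option, a retry loop with a doubling delay can be sketched as follows. This is an illustration under stated assumptions: `ping_with_retries` is a hypothetical name, `attempts` counts total tries (curl's `--retry 5` means up to five retries after the first try), and the `opener` and `sleep` parameters exist only to make the sketch testable. Unlike curl, which retries only errors it considers transient, this sketch retries on any `OSError`, which covers `urllib.error.URLError` and socket timeouts.

```python
import time
import urllib.request


def ping_with_retries(url, attempts=5, timeout=10.0, base_delay=1.0,
                      opener=urllib.request.urlopen, sleep=time.sleep):
    """Return True once a ping succeeds, False if every attempt fails."""
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout):
                return True
        except OSError:
            # Wait 1s, 2s, 4s, ... before the next try; no wait after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return False
```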
Make sure you know how your HTTP client handles failed requests. For example, +if you use an HTTP library which raises exceptions, decide if you want to +catch the exceptions, or let them bubble up.
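One possible choice, sketched below, is to catch network errors around the ping so that a failed ping never fails the job itself. The names `ping_quietly` and `run_job` are hypothetical, as is the `opener` parameter, which is included only for testing. Letting the exception bubble up instead is equally valid when a missed ping should stop the process.

```python
import logging
import urllib.request

log = logging.getLogger("pinger")


def ping_quietly(url, timeout=10.0, opener=urllib.request.urlopen):
    """Send a ping; log and swallow network errors instead of raising."""
    try:
        with opener(url, timeout=timeout):
            return True
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        log.warning("ping to %s failed: %s", url, exc)
        return False


def run_job(job, ping_url, ping=ping_quietly):
    """Run job(), then report success; the job's result is returned either way."""
    result = job()
    ping(ping_url)
    return result
```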
\ No newline at end of file
diff --git a/templates/docs/reliability_tips.md b/templates/docs/reliability_tips.md
new file mode 100644
index 00000000..e6de83c5
--- /dev/null
+++ b/templates/docs/reliability_tips.md
@@ -0,0 +1,41 @@
+# Pinging Reliability Tips
+
+Sending monitoring signals over the public internet is inherently unreliable.
+HTTP requests can sometimes take excessively long or fail completely
+for a variety of reasons. Here are some general tips to make your monitoring
+code more robust.
+
+## Specify HTTP Request Timeout
+
+Put a time limit on how long each ping is allowed to take. This is especially
+important when sending a "start" signal at the start of a job: you don't want
+a stuck ping to prevent the actual job from running. Another case is a continuously
+running worker process which pings SITE_NAME after each completed item. A stuck
+request would block the whole process, so it is important to guard against it.
+
+Specifying the timeout depends on the tool you use. curl, for example, has the
+`--max-time` (shorthand: `-m`) parameter:
+
+```bash
+# Send an HTTP request with a 10 second timeout:
+curl -m 10 PING_URL
+```
+
+## Use Retries
+
+To minimize the number of false alerts you get from SITE_NAME, instruct your HTTP
+client to retry failed requests several times.
+
+Specifying the retry policy depends on the tool you use. curl, for example, has the
+`--retry` parameter:
+
+```bash
+# Retry up to 5 times, with an increasing delay between retries (1s, 2s, 4s, 8s, ...)
+curl --retry 5 PING_URL
+```
+
+## Handle Exceptions
+
+Make sure you know how your HTTP client handles failed requests. For example,
+if you use an HTTP library which raises exceptions, decide if you want to
+catch the exceptions, or let them bubble up.
diff --git a/templates/front/base_docs.html b/templates/front/base_docs.html
index 7c8d6adf..349009ba 100644
--- a/templates/front/base_docs.html
+++ b/templates/front/base_docs.html
@@ -7,18 +7,21 @@