|
|
@@ -4,96 +4,170 @@ |
|
|
|
{% block title %}Documentation - healthchecks.io{% endblock %} |
|
|
|
|
|
|
|
{% block content %} |
|
|
|
<div class="row"> |
|
|
|
<div class="col-sm-12"> |
|
|
|
<h3>Summary</h3> |
|
|
|
<p> |
|
|
|
Each check you create in <a href="{% url 'hc-index' %}">My Checks</a> |
|
|
|
page has a unique "ping" URL. Whenever you access this URL, |
|
|
|
the "Last Ping" value of the corresponding check is updated. |
|
|
|
</p> |
|
|
|
<p>When a certain amount of time passes since the last received ping, the |
|
|
|
check is considered "late", and Health Checks sends an email alert. |
|
|
|
It is all very simple, really.</p> |
|
|
|
|
|
|
|
<h3>Executing a Ping</h3> |
|
|
|
<p> |
|
|
|
At the end of your batch job, add a bit of code to request |
|
|
|
one of your ping URLs. |
|
|
|
</p> |
|
|
|
<ul> |
|
|
|
<li>Both HTTP and HTTPS protocols are fine</li> |
|
|
|
<li>Request method can be GET or POST</li> |
|
|
|
<li>It does not matter what request headers you send</li> |
|
|
|
<li>You can leave the request body empty or put anything in it; either works</li> |
|
|
|
</ul> |
|
|
|
|
|
|
|
<p>The response will have status code "200 OK" and the response body will be a |
|
|
|
short and simple string "OK".</p> |
|
|
|
|
|
|
|
<p> |
|
|
|
In bash scripts, you can use <code>wget</code> or <code>curl</code> to run the requests: |
|
|
|
</p> |
|
|
|
<pre> |
|
|
|
curl {{ ping_endpoint }}{uuid-goes-here} |
|
|
|
</pre> |
|
|
|
|
|
|
|
<h3>When Alerts Are Sent</h3> |
|
|
|
<p> |
|
|
|
Each check has a configurable <strong>Period</strong> parameter, with the default value of one day. |
|
|
|
For periodic tasks, this is the expected time gap between two runs. |
|
|
|
</p> |
|
|
|
<p> |
|
|
|
Additionally, each check has a <strong>Grace</strong> parameter, with a default value of one hour. |
|
|
|
You can use this parameter to account for run time variance of tasks. |
|
|
|
For example, if a backup task completes in 50 seconds one day, and |
|
|
|
completes in 60 seconds the following day, you might not want to get |
|
|
|
alerted because the backups are 10 seconds late. |
|
|
|
</p> |
|
|
|
<p>Each check can be in one of the following states:</p> |
|
|
|
|
|
|
|
<table class="table"> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-question-sign new"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>New.</strong> |
|
|
|
A check that has been created, but has not received any pings yet. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-ok-sign up"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>Up.</strong> |
|
|
|
Time since last ping has not exceeded <strong>Period</strong>. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-exclamation-sign grace"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>Late.</strong> |
|
|
|
Time since last ping has exceeded <strong>Period</strong>, |
|
|
|
but has not yet exceeded <strong>Period</strong> + <strong>Grace</strong>. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-exclamation-sign down"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>Down.</strong> |
|
|
|
Time since last ping has exceeded <strong>Period</strong> + <strong>Grace</strong>. |
|
|
|
When a check goes from "Late" to "Down", healthchecks.io |
|
|
|
sends you an email alert. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
</table> |
|
|
|
|
|
|
|
</div> |
|
|
|
</div> |
|
|
|
<div class="row"><div class="col-sm-12"> |
|
|
|
|
|
|
|
<h2>Summary</h2> |
|
|
|
<p> |
|
|
|
Each check you create in <a href="{% url 'hc-index' %}">My Checks</a> |
|
|
|
page has a unique "ping" URL. Whenever you access this URL, |
|
|
|
the "Last Ping" value of the corresponding check is updated. |
|
|
|
</p> |
|
|
|
<p>When a certain amount of time passes since the last received ping, the |
|
|
|
check is considered "late", and Health Checks sends an email alert. |
|
|
|
It is all very simple, really.</p> |
|
|
|
|
|
|
|
<h2>Executing a Ping</h2> |
|
|
|
<p> |
|
|
|
At the end of your batch job, add a bit of code to request |
|
|
|
your ping URL. |
|
|
|
</p> |
|
|
|
<ul> |
|
|
|
<li>Both HTTP and HTTPS protocols are fine</li> |
|
|
|
<li>Request method can be GET or POST</li> |
|
|
|
<li>It does not matter what request headers you send</li> |
|
|
|
<li>You can leave the request body empty or put anything in it; either works</li> |
|
|
|
</ul> |
|
|
|
|
|
|
|
<p>The response will have status code "200 OK" and the response body will be a |
|
|
|
short and simple string "OK".</p> |
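<p>
As an illustration of the request and response described above, here is a
minimal Python sketch that requests a ping URL and verifies the reply. The
function name <code>ping</code> and its <code>opener</code> parameter are
assumptions made for this example; they are not part of healthchecks.io.
</p>

```python
import urllib.request


def ping(url, opener=urllib.request.urlopen, timeout=10):
    """Request a ping URL; return True if the reply is 200 with body "OK"."""
    # `opener` is a hypothetical hook: it defaults to urllib's urlopen, and can
    # be swapped for a stub so the function can be tried without network access.
    # Note that urlopen raises an exception on connection failures and on
    # error status codes, so callers may want to catch OSError.
    with opener(url, timeout=timeout) as response:
        body = response.read().decode("utf-8").strip()
        return response.status == 200 and body == "OK"
```

<p>
Call it with your own ping URL as the only argument. Whether to treat a
<code>False</code> result as an error is up to your job.
</p>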
|
|
|
|
|
|
|
<p> |
|
|
|
Here are examples of executing pings from different environments. |
|
|
|
</p> |
|
|
|
|
|
|
|
<h3>Crontab</h3> |
|
|
|
|
|
|
|
<p> |
|
|
|
When using cron, probably the easiest approach is to append a <code>curl</code> |
|
|
|
or <code>wget</code> call after your command. The scheduled time comes, |
|
|
|
and your command runs. After it completes, the healthchecks.io check |
|
|
|
gets pinged. |
|
|
|
</p> |
|
|
|
|
|
|
|
{% include "front/snippets/crontab.html" %} |
|
|
|
|
|
|
|
<p>With this simple modification, you monitor several failure |
|
|
|
scenarios:</p> |
|
|
|
|
|
|
|
<ul> |
|
|
|
<li>The whole machine has stopped working (power outage, janitor stumbles on wires, VPS provider problems, etc.)</li> |
|
|
|
<li>The cron daemon is not running, or has an invalid configuration</li> |
|
|
|
<li>cron does start your task, but the task exits with a non-zero exit code</li> |
|
|
|
</ul> |
|
|
|
|
|
|
|
<p>Either way, when your task doesn't finish successfully, you will soon |
|
|
|
know about it.</p> |
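<p>
The same "ping only after the command finishes successfully" idea can be
sketched in Python, assuming the ping should be skipped whenever the command
exits with a non-zero code. The names <code>run_then_ping</code> and
<code>send_ping</code> are hypothetical and exist only for this example.
</p>

```python
import subprocess
import urllib.request


def run_then_ping(command, ping_url, send_ping=None):
    """Run `command`; request `ping_url` only if it exits with code 0."""
    if send_ping is None:
        # Default: a plain HTTP GET, similar to appending curl in crontab.
        def send_ping(url):
            urllib.request.urlopen(url, timeout=10).close()

    exit_code = subprocess.run(command).returncode
    if exit_code == 0:
        send_ping(ping_url)
    # On failure no ping is sent, so the check eventually goes late and down.
    return exit_code
```

<p>
Here <code>command</code> is a list of arguments such as the path to your
backup script, and <code>ping_url</code> is the ping URL of your check.
</p>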
|
|
|
|
|
|
|
|
|
|
|
<h3>Bash or a shell script</h3> |
|
|
|
|
|
|
|
<p>Both <code>curl</code> and <code>wget</code> examples accomplish the same |
|
|
|
thing: they send an HTTP GET request.</p> |
|
|
|
|
|
|
|
<p> |
|
|
|
If using <code>curl</code>, make sure it is installed on your target system. |
|
|
|
Ubuntu, for example, does not have curl installed out of the box. |
|
|
|
</p> |
|
|
|
|
|
|
|
{% include "front/snippets/bash.html" %} |
|
|
|
|
|
|
|
<h3>Python</h3> |
|
|
|
{% include "front/snippets/python.html" %} |
|
|
|
|
|
|
|
<h3>Node</h3> |
|
|
|
{% include "front/snippets/node.html" %} |
|
|
|
|
|
|
|
|
|
|
|
<h3>PHP</h3> |
|
|
|
{% include "front/snippets/php.html" %} |
|
|
|
|
|
|
|
<h3>Browser</h3> |
|
|
|
<p> |
|
|
|
healthchecks.io includes the <code>Access-Control-Allow-Origin:*</code> |
|
|
|
CORS header in its ping responses, so cross-domain AJAX requests |
|
|
|
should work. |
|
|
|
</p> |
|
|
|
{% include "front/snippets/browser.html" %} |
|
|
|
|
|
|
|
<h3>Email</h3> |
|
|
|
<p> |
|
|
|
As an alternative to HTTP/HTTPS requests, |
|
|
|
you can "ping" this check by sending an |
|
|
|
email message to <a href="mailto:{{ check.email }}">{{ check.email }}</a>. |
|
|
|
</p> |
|
|
|
<p> |
|
|
|
This is useful for end-to-end testing of weekly email delivery. |
|
|
|
</p> |
|
|
|
<p> |
|
|
|
An example scenario: you have a cron job which runs weekly and |
|
|
|
sends weekly email reports to a list of email addresses. You have already |
|
|
|
set up a check to get alerted when your cron job fails to run. |
|
|
|
But what you ultimately want to check is that your emails <em>get sent and |
|
|
|
get delivered</em>. |
|
|
|
</p> |
|
|
|
<p> |
|
|
|
The solution: set up another check, and add its |
|
|
|
@hchk.io address to your list of recipient email addresses. Set its |
|
|
|
Period to 1 week. As long as your weekly email script runs correctly, |
|
|
|
the check will be regularly pinged and will stay up. |
|
|
|
</p> |
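<p>
A minimal Python sketch of the setup above, assuming the weekly report is
built with the standard library's <code>email</code> package. The function
name, the subject line, and the example addresses are placeholders, and
actual delivery (for instance with <code>smtplib</code>) is left out.
</p>

```python
from email.message import EmailMessage


def build_weekly_report(sender, recipients, check_address, body):
    """Build the report message with the check's address added as a recipient."""
    msg = EmailMessage()
    msg["From"] = sender
    # The check's address is simply one more recipient: if the message is
    # delivered there, the check receives its weekly ping.
    msg["To"] = ", ".join(list(recipients) + [check_address])
    msg["Subject"] = "Weekly report"  # placeholder subject
    msg.set_content(body)
    return msg
```

<p>
Pass the @hchk.io address of your second check as <code>check_address</code>.
</p>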
|
|
|
|
|
|
|
|
|
|
|
<h2>When Alerts Are Sent</h2> |
|
|
|
<p> |
|
|
|
Each check has a configurable <strong>Period</strong> parameter, with the default value of one day. |
|
|
|
For periodic tasks, this is the expected time gap between two runs. |
|
|
|
</p> |
|
|
|
<p> |
|
|
|
Additionally, each check has a <strong>Grace</strong> parameter, with a default value of one hour. |
|
|
|
You can use this parameter to account for run time variance of tasks. |
|
|
|
For example, if a backup task completes in 50 seconds one day, and |
|
|
|
completes in 60 seconds the following day, you might not want to get |
|
|
|
alerted because the backups are 10 seconds late. |
|
|
|
</p> |
|
|
|
<p>Each check can be in one of the following states:</p> |
|
|
|
|
|
|
|
<table class="table"> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-question-sign new"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>New.</strong> |
|
|
|
A check that has been created, but has not received any pings yet. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-ok-sign up"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>Up.</strong> |
|
|
|
Time since last ping has not exceeded <strong>Period</strong>. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-exclamation-sign grace"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>Late.</strong> |
|
|
|
Time since last ping has exceeded <strong>Period</strong>, |
|
|
|
but has not yet exceeded <strong>Period</strong> + <strong>Grace</strong>. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
<tr> |
|
|
|
<td> |
|
|
|
<span class="glyphicon glyphicon-exclamation-sign down"></span> |
|
|
|
</td> |
|
|
|
<td> |
|
|
|
<strong>Down.</strong> |
|
|
|
Time since last ping has exceeded <strong>Period</strong> + <strong>Grace</strong>. |
|
|
|
When a check goes from "Late" to "Down", healthchecks.io |
|
|
|
sends you an email alert. |
|
|
|
</td> |
|
|
|
</tr> |
|
|
|
</table> |
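<p>
The four states in the table can be expressed as a small function. This is
a sketch based only on the descriptions above, not the service's actual
implementation; the function name <code>check_status</code> and its return
strings are assumptions, and the defaults mirror the documented Period of one
day and Grace of one hour.
</p>

```python
from datetime import datetime, timedelta


def check_status(last_ping, now, period=timedelta(days=1), grace=timedelta(hours=1)):
    """Return "new", "up", "late" or "down" for a check."""
    if last_ping is None:
        # Created, but no pings received yet.
        return "new"
    elapsed = now - last_ping
    if elapsed <= period:
        return "up"
    if elapsed <= period + grace:
        return "late"
    # Past Period + Grace: this is the point where an alert would be sent.
    return "down"
```

<p>
For example, with the defaults, a check last pinged 24 hours and 30 minutes
ago is "late", and one last pinged 26 hours ago is "down".
</p>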
|
|
|
|
|
|
|
</div></div> |
|
|
|
{% endblock %} |