|
|
- {% extends "front/base_docs.html" %}
- {% load compress static hc_extras %}
-
- {% block title %}Documentation - {% site_name %}{% endblock %}
- {% block description %}
- <meta name="description" content="Monitor any service that can make a HTTP request or send an email: cron jobs, Bash scripts, Python, Ruby, Node, PHP, JS, ...">
- {% endblock %}
- {% block keywords %}
- <meta name="keywords" content="healthchecks, crontab monitoring, python health check, bash health check, cron monitoring, cron tutorial, cron howto, api health check, open source">
- {% endblock %}
-
-
- {% block docs_content %}
-
- <h2>Summary</h2>
- <p>
- Each check in <a href="{% url 'hc-index' %}">My Checks</a>
- page has a unique "ping" URL. Whenever you make a HTTP request to this URL,
- {% site_name %} records the request and updates the "Last Ping" value of
- the corresponding check.
- </p>
-
- <p>When a certain, configurable amount of time passes since last received ping,
- the check is considered "late". {% site_name %} then
- waits for additional time (configured with the "Grace Time" parameter) and,
- if still no ping, sends you an alert.</p>
-
- <p>As long as the monitored service sends pings on time, you receive no
- alerts. As soon as it fails to check in on time, you get notified.
- It is a simple idea.</p>
-
- <h2>Executing a Ping</h2>
- <p>
- At the end of your batch job, add a bit of code to request
- your ping URL.
- </p>
- <ul>
- <li>HTTP and HTTPS protocols both work.
- Prefer HTTPS, but on old systems you may need to fall back to HTTP.</li>
- <li>Request method can be GET, POST or HEAD</li>
- <li>Both IPv4 and IPv6 work</li>
- <li>
- For HTTP POST requests, you can include additional diagnostic information
- for your own reference in the request body. If the request body looks
- like a UTF-8 string, {% site_name %} will log the first 10 kilobytes
- of the request body, so you can inspect it later.
- </li>
- </ul>
-
- <p>The response will have status code "200 OK" and response body will be a
- short and simple string "OK".</p>
-
- <h2>Signalling a Failure</h2>
- <p>
- Append <code>/fail</code> to a ping URL and use it to actively signal a
- failure. Requesting the <code>/fail</code> URL will immediately mark the
- check as "down". You can use this feature to minimize the delay from
- your monitored service failing to you getting a notification.
- </p>
- <p>Below is a skeleton code example in Python which signals a failure when the
- work function returns an unexpected value or throws an exception:</p>
-
- {% include "front/snippets/python_requests_fail.html" %}
-
- <a name="crontab"></a>
- <h3>Crontab</h3>
-
- <p>
- When using cron, probably the easiest is to append a curl
- or wget call after your command. The scheduled time comes,
- and your command runs. If it completes successfully (exit code 0),
- curl or wget runs a HTTP GET call to the ping URL.
- </p>
-
- {% include "front/snippets/crontab.html" %}
-
- <p>With this simple modification, you monitor several failure
- scenarios:</p>
-
- <ul>
- <li>The whole machine has stopped working (power outage, janitor stumbles on wires, VPS provider problems, etc.) </li>
- <li>cron daemon is not running, or has invalid configuration</li>
- <li>cron does start your task, but the task exits with non-zero exit code</li>
- </ul>
-
- <p>Either way, when your task doesn't finish successfully, you will soon
- know about it.</p>
-
- <p>The extra options to curl are meant to suppress any output, unless it hits
- an error. This is to prevent cron from sending an email every time the
- task runs. Feel free to adjust the curl options to your liking.
- </p>
- <table class="table curl-opts">
- <tr>
- <th>&&</th>
- <td>Run curl only if <code>/home/user/backup.sh</code> succeeds</td>
- </tr>
- <tr>
- <th>
- -f, --fail
- </th>
- <td>Makes curl treat non-200 responses as errors</td>
- </tr>
- <tr>
- <th>-s, --silent</th>
- <td>Silent or quiet mode. Don't show progress meter or error messages.</td>
- </tr>
- <tr>
- <th>-S, --show-error</th>
- <td>When used with -s it makes curl show error message if it fails.</td>
- </tr>
- <tr>
- <th>--retry <num></th>
- <td>
- If a transient error is returned when curl tries to perform a
- transfer, it will retry this number of times before giving up.
- Setting the number to 0 makes curl do no retries
- (which is the default). Transient error means either: a timeout,
- an FTP 4xx response code or an HTTP 5xx response code.
- </td>
- </tr>
- <tr>
- <th>> /dev/null</th>
- <td>
- Redirect curl's stdout to /dev/null (error messages go to stderr,)
- </td>
- </tr>
- </table>
-
- <a name="bash"></a>
- <h3>Bash or a shell script</h3>
-
- <p>Both <code>curl</code> and <code>wget</code> examples accomplish the same
- thing: they fire off a HTTP GET method.</p>
-
- <p>
- If using <code>curl</code>, make sure it is installed on your target system.
- Ubuntu, for example, does not have curl installed out of the box.
- </p>
-
- {% include "front/snippets/bash_curl.html" %}
- {% include "front/snippets/bash_wget.html" %}
-
- <a name="python"></a>
- <h3>Python</h3>
-
- <p>
- If you are already using the
- <a href="http://docs.python-requests.org/en/master/">requests</a> library,
- it's convenient to also use it here:
- </p>
- {% include "front/snippets/python_requests.html" %}
-
- <p>
- Otherwise, you can use the <code>urllib</code> standard module.
- </p>
-
- {% include "front/snippets/python_urllib2.html" %}
-
- <a name="ruby"></a>
- <h3>Ruby</h3>
- {% include "front/snippets/ruby.html" %}
-
- <a name="node"></a>
- <h3>Node</h3>
- {% include "front/snippets/node.html" %}
-
- <a name="php"></a>
- <h3>PHP</h3>
- {% include "front/snippets/php.html" %}
-
- <a name="browser"></a>
- <h3>Browser</h3>
- <p>
- {% site_name %} includes <code>Access-Control-Allow-Origin:*</code>
- CORS header in its ping responses, so cross-domain AJAX requests
- should work.
- </p>
- {% include "front/snippets/browser.html" %}
-
- <a name="powershell"></a>
- <h3>PowerShell</h3>
- <p>
- You can use <a href="https://msdn.microsoft.com/en-us/powershell/mt173057.aspx">PowerShell</a>
- and Windows Task Scheduler to automate various tasks on a Windows system.
- From within a PowerShell script it is also easy to ping {% site_name %}.
- </p>
- <p>Here is a simple PowerShell script that pings {% site_name %}.
- When scheduled to run with Task Scheduler, it will essentially
- just send regular "I'm alive" messages. You can of course extend it to
- do more things.</p>
-
- {% include "front/snippets/powershell.html" %}
-
- <p>Save the above to e.g. <code>C:\Scripts\healthchecks.ps1</code>. Then use
- the following command in a Scheduled Task to run the script:
- </p>
-
- <div class="highlight">
- <pre>powershell.exe -ExecutionPolicy bypass -File C:\Scripts\healthchecks.ps1</pre>
- </div>
-
- <p>In simple cases, you can also pass the script to PowerShell directly,
- using the "-command" argument:</p>
-
- {% include "front/snippets/powershell_inline.html" %}
-
- <a name="email"></a>
- <h3>Email</h3>
- <p>
- As an alternative to HTTP/HTTPS requests,
- you can "ping" this check by sending an
- email message to <strong>{{ ping_email }}</strong>
- </p>
- <p>
- This is useful for end-to-end testing weekly email delivery.
- </p>
- <p>
- An example scenario: you have a cron job which runs weekly and
- sends weekly email reports to a list of e-mail addresses. You have already
- set up a check to get alerted when your cron job fails to run.
- But what you ultimately want to check is your emails <em>get sent and
- get delivered</em>.
- </p>
- <p>
- The solution: set up another check, and add its
- @hchk.io address to your list of recipient email addresses. Set its
- Period to 1 week. As long as your weekly email script runs correctly,
- the check will be regularly pinged and will stay up.
- </p>
-
-
- <h2>When Alerts Are Sent</h2>
- <p>
- Each check has a configurable <strong>Period</strong> parameter, with the default value of one day.
- For periodic tasks, this is the expected time gap between two runs.
- </p>
- <p>
- Additionally, each check has a <strong>Grace</strong> parameter, with default value of one hour.
- You can use this parameter to account for run time variance of tasks.
- For example, if a backup task completes in 50 seconds one day, and
- completes in 60 seconds the following day, you might not want to get
- alerted because the backups are 10 seconds late.
- </p>
- <p>Each check can be in one of the following states:</p>
-
- <table class="table">
- <tr>
- <td>
- <span class="status icon-new"></span>
- </td>
- <td>
- <strong>New.</strong>
- A check that has been created, but has not received any pings yet.
- </td>
- </tr>
- <tr>
- <td>
- <span class="status icon-paused"></span>
- </td>
- <td>
- <strong>Monitoring Paused.</strong>
- You can resume monitoring of a paused check by pinging it.
- </td>
- </tr>
- <tr>
- <td>
- <span class="status icon-up"></span>
- </td>
- <td>
- <strong>Up.</strong>
- Time since last ping has not exceeded <strong>Period</strong>.
- </td>
- </tr>
- <tr>
- <td>
- <span class="status icon-grace"></span>
- </td>
- <td>
- <strong>Late.</strong>
- Time since last ping has exceeded <strong>Period</strong>,
- but has not yet exceeded <strong>Period</strong> + <strong>Grace</strong>.
- </td>
- </tr>
- <tr>
- <td>
- <span class="status icon-down"></span>
- </td>
- <td>
- <strong>Down.</strong>
- Time since last ping has exceeded <strong>Period</strong> + <strong>Grace</strong>.
- When check goes from "Late" to "Down", {% site_name %}
- sends you an alert.
- </td>
- </tr>
- </table>
-
- {% endblock %}
-
- {% block scripts %}
- {% compress js %}
- <script src="{% static 'js/jquery-2.1.4.min.js' %}"></script>
- <script src="{% static 'js/bootstrap.min.js' %}"></script>
- <script src="{% static 'js/clipboard.min.js' %}"></script>
- <script src="{% static 'js/snippet-copy.js' %}"></script>
- {% endcompress %}
- {% endblock %}
|