
Improve docs, add "Concepts" section

cc: #547
pull/551/head
Pēteris Caune 3 years ago
parent
commit
234b681df8
No known key found for this signature in database GPG Key ID: E28D7679E9A9EDE2
7 changed files with 175 additions and 15 deletions
  1. static/css/docs.css (+9, -0)
  2. templates/docs/configuring_checks.html (+6, -5)
  3. templates/docs/configuring_checks.md (+6, -5)
  4. templates/docs/introduction.html (+65, -1)
  5. templates/docs/introduction.md (+85, -0)
  6. templates/docs/measuring_script_run_time.html (+2, -2)
  7. templates/docs/measuring_script_run_time.md (+2, -2)

static/css/docs.css (+9, -0)

@@ -109,6 +109,7 @@ h2.rule {
.page-docs dt, .page-docs dd {
border-top: 1px solid var(--border-color);
padding: 8px 0;
line-height: 1.8;
}
.rule + p code {
@@ -116,3 +117,11 @@ h2.rule {
font-weight: bold;
padding: 2px 4px;
}
.docs-introduction dl {
grid-template-columns: 48px auto;
}
.docs-introduction dt {
text-align: center;
}

templates/docs/configuring_checks.html (+6, -5)

@@ -3,7 +3,7 @@
monitor. For example, when monitoring cron jobs, you would create a separate check for
each cron job to be monitored. SITE_NAME pricing plans are structured primarily
around how many checks you can have in your account. You can create checks
either in SITE_NAME web interface or by calling <a href="../api/">Management API</a>.</p>
either in SITE_NAME web interface or via <a href="../api/">Management API</a>.</p>
<h2>Name, Tags, Description</h2>
<p>Describe each check using an optional name, tags, and description fields.</p>
<p><img alt="Editing name, tags and description" src="IMG_URL/edit_name.png" /></p>
@@ -33,10 +33,11 @@ is late. Use this parameter to account for small, expected deviations in job
execution times.</li>
</ul>
<p>Note: if you use the "start" signal to <a href="../measuring_script_run_time/">measure job run times</a>,
then Grace Time also specifies how long the job is expected to run. Whenever SITE_NAME
receives a "start" signal, it expects to receive a subsequent "success" signal
within Grace Time. If the success signal does not arrive within the configured
Grace Time, SITE_NAME will mark the check as failed and send out alerts.</p>
then Grace Time also specifies the maximum allowed time gap between "start" and
"success" signals. Whenever SITE_NAME receives a "start" signal, it expects to
receive a subsequent "success" signal within Grace Time. If the success signal does
not arrive within the configured Grace Time, SITE_NAME will mark the check as failed
and send out alerts.</p>
<h2>Cron Schedules</h2>
<p>Use "cron" for monitoring processes with more complex schedules. This monitoring mode
ensures that jobs run <strong>at the correct time</strong>, and not just at correct time intervals.</p>


templates/docs/configuring_checks.md (+6, -5)

@@ -4,7 +4,7 @@ In SITE_NAME, a **Check** represents a single service you want to
monitor. For example, when monitoring cron jobs, you would create a separate check for
each cron job to be monitored. SITE_NAME pricing plans are structured primarily
around how many checks you can have in your account. You can create checks
either in SITE_NAME web interface or by calling [Management API](../api/).
either in SITE_NAME web interface or via [Management API](../api/).
## Name, Tags, Description
@@ -40,10 +40,11 @@ is late. Use this parameter to account for small, expected deviations in job
execution times.
Note: if you use the "start" signal to [measure job run times](../measuring_script_run_time/),
then Grace Time also specifies how long the job is expected to run. Whenever SITE_NAME
receives a "start" signal, it expects to receive a subsequent "success" signal
within Grace Time. If the success signal does not arrive within the configured
Grace Time, SITE_NAME will mark the check as failed and send out alerts.
then Grace Time also specifies the maximum allowed time gap between "start" and
"success" signals. Whenever SITE_NAME receives a "start" signal, it expects to
receive a subsequent "success" signal within Grace Time. If the success signal does
not arrive within the configured Grace Time, SITE_NAME will mark the check as failed
and send out alerts.
## Cron Schedules


templates/docs/introduction.html (+65, -1)

@@ -24,4 +24,68 @@ run continuously or on a regular, known schedule. For example:</p>
<li>collecting application performance metrics</li>
<li>error tracking</li>
<li>log aggregation</li>
</ul>
</ul>
<h2>Concepts</h2>
<p>A <strong>Check</strong> represents a single service you want to monitor. For example, when
<a href="monitoring_cron_jobs/">monitoring cron jobs</a>, you would create a separate check for
each cron job to be monitored. Each check has a unique ping URL, a set schedule,
and associated integrations. For the available configuration options, see
<a href="configuring_checks/">Configuring checks</a>.</p>
<p>Each check is always in one of the following states, depicted by a status icon:</p>
<dl>
<dt><span class="status ic-new"></span></dt>
<dd><strong>New</strong>. A newly created check that has not received any pings yet. Each new
check you create will start in this state.</dd>
<dt><span class="status ic-up"></span></dt>
<dd><strong>Up</strong>. All is well, the last "success" signal has arrived on time.</dd>
<dt><span class="status ic-grace"></span></dt>
<dd><strong>Late</strong>. The "success" signal is due but has not arrived yet.
It is not yet late by more than the check's configured <strong>Grace Time</strong>.</dd>
<dt><span class="status ic-down"></span></dt>
<dd><strong>Down</strong>. The "success" signal has not arrived yet, and the Grace Time has elapsed.
When a check transitions into the "Down" state, SITE_NAME sends out alert
messages via the configured integrations.</dd>
<dt><span class="status ic-paused"></span></dt>
<dd><strong>Paused</strong>. You can manually pause the monitoring of specific checks. For example,
if a frequently running cron job has a known problem, and a fix is scheduled but
not yet ready, you can pause monitoring of the corresponding check temporarily to
avoid unwanted alerts about a known issue.</dd>
<dt><span class="status ic-up"></span><div class="spinner started"><div class="d1"></div><div class="d2"></div><div class="d3"></div></div></dt>
<dd>Additionally, if the most recent received signal is a "start" signal,
this will be indicated by three animated dots under check's status icon.</dd>
</dl>
<hr />
<p><strong>Ping URL</strong>. Each check has a unique <strong>Ping URL</strong>. Clients (cron jobs, background
workers, batch scripts, scheduled tasks, web services) make HTTP requests to the
ping URL to signal a start of the execution, a success, or a failure.</p>
<p>While the "success" signals are essential, "start" and "failure" are optional.
You don't have to use them, but you can gain additional monitoring insights
by using them. See <a href="measuring_script_run_time/">Measuring script run time</a> and
<a href="signaling_failures/">Signaling failures</a> for details.</p>
<p>You should treat ping URLs as secrets. If you make them public, anybody can send
telemetry signals to your checks and mess with your monitoring.</p>
<hr />
<p><strong>Grace Time</strong> is one of the configuration parameters you can set for each check.
It is the additional time to wait before sending an alert when a check
is late. Use this parameter to account for small, expected deviations in job
execution times. If you use "start" signals to
<a href="measuring_script_run_time/">measure job execution time</a>, Grace Time also sets the
maximum allowed time gap between "start" and "success" signals. If a job
sends a "start" signal but then does not send a "success" signal within grace time,
SITE_NAME will assume the job has failed, and send out alerts.</p>
<hr />
<p>An <strong>Integration</strong> is a specific method for delivering monitoring alerts when checks
change states. SITE_NAME supports many different types of integrations: email,
webhooks, SMS, Slack, PagerDuty, etc. You can set up multiple integrations.
For each check, you can specify which integrations it should use.</p>
<p>For more information on integrations, see
<a href="configuring_notifications/">Configuring notifications</a>.</p>
<hr />
<p><strong>Project</strong>. To keep things organized, you can group checks and integrations in <strong>Projects</strong>.
Your account starts with a single default project, but you can create any number
of additional projects as needed. You can transfer existing checks between projects
while preserving their configuration and ping URL.</p>
<p>Each project has a configurable name, a separate set of API keys, and a separate
project team. The project's team is the set of people you have granted read-only or
read-write access to the project.</p>
<p>For more information on projects, see <a href="projects_teams/">Projects and teams</a>.</p>

templates/docs/introduction.md (+85, -0)

@@ -25,3 +25,88 @@ SITE_NAME is *not* the right tool for:
* collecting application performance metrics
* error tracking
* log aggregation
## Concepts
A **Check** represents a single service you want to monitor. For example, when
[monitoring cron jobs](monitoring_cron_jobs/), you would create a separate check for
each cron job to be monitored. Each check has a unique ping URL, a set schedule,
and associated integrations. For the available configuration options, see
[Configuring checks](configuring_checks/).
Each check is always in one of the following states, depicted by a status icon:
<span class="status ic-new"></span>
: **New**. A newly created check that has not received any pings yet. Each new
check you create will start in this state.
<span class="status ic-up"></span>
: **Up**. All is well, the last "success" signal has arrived on time.
<span class="status ic-grace"></span>
: **Late**. The "success" signal is due but has not arrived yet.
It is not yet late by more than the check's configured **Grace Time**.
<span class="status ic-down"></span>
: **Down**. The "success" signal has not arrived yet, and the Grace Time has elapsed.
When a check transitions into the "Down" state, SITE_NAME sends out alert
messages via the configured integrations.
<span class="status ic-paused"></span>
: **Paused**. You can manually pause the monitoring of specific checks. For example,
if a frequently running cron job has a known problem, and a fix is scheduled but
not yet ready, you can pause monitoring of the corresponding check temporarily to
avoid unwanted alerts about a known issue.
<span class="status ic-up"></span><div class="spinner started"><div class="d1"></div><div class="d2"></div><div class="d3"></div></div>
: Additionally, if the most recent received signal is a "start" signal,
this will be indicated by three animated dots under check's status icon.
---
**Ping URL**. Each check has a unique **Ping URL**. Clients (cron jobs, background
workers, batch scripts, scheduled tasks, web services) make HTTP requests to the
ping URL to signal a start of the execution, a success, or a failure.
While the "success" signals are essential, "start" and "failure" are optional.
You don't have to use them, but you can gain additional monitoring insights
by using them. See [Measuring script run time](measuring_script_run_time/) and
[Signaling failures](signaling_failures/) for details.
You should treat ping URLs as secrets. If you make them public, anybody can send
telemetry signals to your checks and mess with your monitoring.
---
**Grace Time** is one of the configuration parameters you can set for each check.
It is the additional time to wait before sending an alert when a check
is late. Use this parameter to account for small, expected deviations in job
execution times. If you use "start" signals to
[measure job execution time](measuring_script_run_time/), Grace Time also sets the
maximum allowed time gap between "start" and "success" signals. If a job
sends a "start" signal but then does not send a "success" signal within grace time,
SITE_NAME will assume the job has failed, and send out alerts.
---
An **Integration** is a specific method for delivering monitoring alerts when checks
change states. SITE_NAME supports many different types of integrations: email,
webhooks, SMS, Slack, PagerDuty, etc. You can set up multiple integrations.
For each check, you can specify which integrations it should use.
For more information on integrations, see
[Configuring notifications](configuring_notifications/).
---
**Project**. To keep things organized, you can group checks and integrations in **Projects**.
Your account starts with a single default project, but you can create any number
of additional projects as needed. You can transfer existing checks between projects
while preserving their configuration and ping URL.
Each project has a configurable name, a separate set of API keys, and a separate
project team. The project's team is the set of people you have granted read-only or
read-write access to the project.
For more information on projects, see [Projects and teams](projects_teams/).
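
The check states introduced in the "Concepts" text above can be read as a small decision procedure. The following is a minimal sketch, not the project's implementation: it assumes a simple schedule where the next "success" signal is due one `period` after the previous one, and the names `check_status`, `period`, and the exact boundary handling are hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def check_status(
    last_success: Optional[datetime],
    period: timedelta,  # assumed: expected gap between "success" signals
    grace: timedelta,   # the check's configured Grace Time
    now: datetime,
    paused: bool = False,
) -> str:
    """Return one of "paused", "new", "up", "late", "down"."""
    if paused:
        # Monitoring was paused manually; no alerts are expected.
        return "paused"
    if last_success is None:
        # The check has not received any pings yet.
        return "new"
    due = last_success + period
    if now < due:
        # The last "success" signal arrived on time.
        return "up"
    if now <= due + grace:
        # The signal is due, but Grace Time has not elapsed yet.
        return "late"
    # Grace Time has elapsed; this is where alerts would be sent.
    return "down"
```

For example, with a one-hour period and ten minutes of grace, a check whose last success was 65 minutes ago is "late", and it becomes "down" once more than 70 minutes have passed.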

templates/docs/measuring_script_run_time.html (+2, -2)

@@ -3,10 +3,10 @@
After receiving a start signal, Healthchecks.io will show the check as "Started."
It will store the "start" events and display the job execution times. SITE_NAME
calculates the job execution times as the time gaps between adjacent "start" and
"complete" events.</p>
"success" events.</p>
<h2>Alerting Logic</h2>
<p>SITE_NAME applies an additional alerting rule for jobs that use the <code>/start</code> signal.</p>
<p>If a job sends a "start" signal, but then does not send a "complete"
<p>If a job sends a "start" signal, but then does not send a "success"
signal within its configured grace time, SITE_NAME will assume the job
has failed. It will mark the job as "down" and send out alerts.</p>
<h2>Usage Example</h2>


templates/docs/measuring_script_run_time.md (+2, -2)

@@ -4,13 +4,13 @@
After receiving a start signal, Healthchecks.io will show the check as "Started."
It will store the "start" events and display the job execution times. SITE_NAME
calculates the job execution times as the time gaps between adjacent "start" and
"complete" events.
"success" events.
## Alerting Logic
SITE_NAME applies an additional alerting rule for jobs that use the `/start` signal.
If a job sends a "start" signal, but then does not send a "complete"
If a job sends a "start" signal, but then does not send a "success"
signal within its configured grace time, SITE_NAME will assume the job
has failed. It will mark the job as "down" and send out alerts.
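
The "start" then "success" sequence that this wording change describes could look roughly like the sketch below on the client side. It is an illustration under stated assumptions, not code from this commit: `run_with_pings` and `http_get` are hypothetical helper names, the `/start` suffix is the one named in the text above, and the `/fail` suffix for failure signals is assumed rather than shown in this diff. The `send` parameter exists only so the ordering of signals can be exercised without network access.

```python
import urllib.request
from typing import Callable, TypeVar

T = TypeVar("T")


def http_get(url: str) -> None:
    """Send a GET request; the response body is not needed."""
    with urllib.request.urlopen(url, timeout=10):
        pass


def run_with_pings(
    job: Callable[[], T],
    ping_url: str,
    send: Callable[[str], None] = http_get,
) -> T:
    """Run job(), signaling "start" before it and "success" after it."""
    send(ping_url + "/start")     # "start" signal
    try:
        result = job()
    except Exception:
        send(ping_url + "/fail")  # assumed suffix for the "failure" signal
        raise
    send(ping_url)                # plain ping URL acts as the "success" signal
    return result
```

If the job hangs or the process dies after the "start" request, no "success" request is ever made, which is the case the grace-time rule above is meant to catch.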

