<h1>SITE_NAME Documentation</h1>
<p>SITE_NAME is a service for monitoring cron jobs and similar periodic processes:</p>
<ul>
<li>SITE_NAME <strong>listens for HTTP requests ("pings")</strong> from your cron jobs and scheduled
tasks.</li>
<li>It <strong>keeps silent</strong> as long as pings arrive on time.</li>
<li>It <strong>raises an alert</strong> as soon as a ping does not arrive on time.</li>
</ul>
<p>SITE_NAME works as a <a href="https://en.wikipedia.org/wiki/Dead_man%27s_switch">dead man's switch</a> for processes that need to
run continuously or on a regular, known schedule. For example:</p>
<ul>
<li>filesystem backups, database backups</li>
<li>task queues</li>
<li>database replication status</li>
<li>report generation scripts</li>
<li>periodic data import and sync jobs</li>
<li>periodic antivirus scans</li>
<li>DDNS updater scripts</li>
<li>SSL renewal scripts</li>
</ul>
<p>SITE_NAME is <em>not</em> the right tool for:</p>
<ul>
<li>monitoring website uptime by probing it with HTTP requests</li>
<li>collecting application performance metrics</li>
<li>error tracking</li>
<li>log aggregation</li>
</ul>
<h2>Concepts</h2>
<p>A <strong>Check</strong> represents a single service you want to monitor. For example, when
<a href="monitoring_cron_jobs/">monitoring cron jobs</a>, you would create a separate check for
each cron job to be monitored. Each check has a unique ping URL, a set schedule,
and associated integrations. For the available configuration options, see
<a href="configuring_checks/">Configuring checks</a>.</p>
<p>Each check is always in one of the following states, depicted by a status icon:</p>
<dl>
<dt><span class="status ic-new"></span></dt>
<dd><strong>New</strong>. A newly created check that has not received any pings yet. Each new
check you create starts in this state.</dd>
<dt><span class="status ic-up"></span></dt>
<dd><strong>Up</strong>. All is well: the last "success" signal arrived on time.</dd>
<dt><span class="status ic-grace"></span></dt>
<dd><strong>Late</strong>. The "success" signal is due but has not arrived yet.
It is not yet late by more than the check's configured <strong>Grace Time</strong>.</dd>
<dt><span class="status ic-down"></span></dt>
<dd><strong>Down</strong>. The "success" signal has not arrived yet, and the Grace Time has elapsed.
When a check transitions into the "Down" state, SITE_NAME sends out alert
messages via the configured integrations.</dd>
<dt><span class="status ic-paused"></span></dt>
<dd><strong>Paused</strong>. You can manually pause the monitoring of specific checks. For example,
if a frequently running cron job has a known problem, and a fix is scheduled but
not yet ready, you can pause monitoring of the corresponding check temporarily to
avoid unwanted alerts about a known issue.</dd>
<dt><span class="status ic-up"></span><div class="spinner started"><div class="d1"></div><div class="d2"></div><div class="d3"></div></div></dt>
<dd>Additionally, if the most recently received signal is a "start" signal,
three animated dots appear under the check's status icon.</dd>
</dl>
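<p>The state rules above can be sketched in a few lines of Python. This is an
illustration only, not SITE_NAME's actual implementation: it assumes a simple
schedule in which a "success" signal is expected once every fixed <code>period</code>,
and the function name, parameter names, and exact boundary handling are
hypothetical.</p>

```python
from datetime import datetime, timedelta
from typing import Optional


def check_status(
    last_success: Optional[datetime],
    period: timedelta,
    grace: timedelta,
    now: datetime,
    paused: bool = False,
) -> str:
    """Return "paused", "new", "up", "late", or "down" for one check.

    Assumption: a "success" signal is expected every `period`, counted
    from the previous "success" signal.
    """
    if paused:
        return "paused"
    if last_success is None:
        # No pings received yet.
        return "new"
    due = last_success + period
    if now < due:
        # The next signal is not due yet.
        return "up"
    if now <= due + grace:
        # Due, but not late by more than the Grace Time.
        return "late"
    # The Grace Time has elapsed without a "success" signal.
    return "down"
```

<p>With a one-hour period and a ten-minute Grace Time, a check that last reported
success at 00:00 would be "up" at 00:30, "late" at 01:05, and "down" at 01:11
under these assumptions.</p>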
<hr />
<p><strong>Ping URL</strong>. Each check has a unique <strong>Ping URL</strong>. Clients (cron jobs, background
workers, batch scripts, scheduled tasks, web services) make HTTP requests to the
ping URL to signal the start of an execution, a success, or a failure.</p>
<p>While the "success" signals are essential, "start" and "failure" are optional.
You don't have to use them, but you can gain additional monitoring insights
by using them. See <a href="measuring_script_run_time/">Measuring script run time</a> and
<a href="signaling_failures/">Signaling failures</a> for details.</p>
<p>You should treat ping URLs as secrets. If you make them public, anybody can send
telemetry signals to your checks and interfere with your monitoring.</p>
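<p>As a rough sketch, a Python job could send these signals with the standard
library alone. The <code>/start</code> and <code>/fail</code> URL suffixes below are assumptions
made for illustration, as are the helper names; the pages linked above describe
the actual URL format. The ping URL is passed in as a parameter so that it can
be loaded from configuration or an environment variable rather than committed
to source control.</p>

```python
import urllib.request

# Assumed suffixes for each signal type; check the linked pages for the
# format your SITE_NAME instance actually uses.
SIGNAL_SUFFIXES = {"success": "", "start": "/start", "failure": "/fail"}


def signal_url(ping_url: str, signal: str = "success") -> str:
    """Build the URL for a given signal from the check's ping URL."""
    return ping_url.rstrip("/") + SIGNAL_SUFFIXES[signal]


def send_signal(ping_url: str, signal: str = "success", timeout: float = 10.0) -> bool:
    """Send one signal; return True on a 2xx response, False on any network error."""
    try:
        with urllib.request.urlopen(signal_url(ping_url, signal), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError and HTTPError are OSError subclasses. A monitoring
        # outage should not crash the job itself.
        return False


def run_monitored(job, ping_url: str) -> None:
    """Run `job`, signaling start, then success or failure."""
    send_signal(ping_url, "start")
    try:
        job()
    except Exception:
        send_signal(ping_url, "failure")
        raise
    send_signal(ping_url, "success")
```

<p>Only the final "success" signal is required; the "start" and "failure" calls
can be dropped if you do not need the extra insight.</p>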
<hr />
<p><strong>Grace Time</strong> is one of the configuration parameters you can set for each check.
It is the additional time to wait before sending an alert when a check
is late. Use this parameter to account for small, expected deviations in job
execution times. If you use "start" signals to
<a href="measuring_script_run_time/">measure job execution time</a>, Grace Time also sets the
maximum allowed time gap between "start" and "success" signals. If a job
sends a "start" signal but then does not send a "success" signal within the Grace Time,
SITE_NAME assumes the job has failed and sends out alerts.</p>
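<p>The "start"-to-"success" rule can be expressed as a small predicate. This is a
sketch under stated assumptions, not SITE_NAME's code: the function and
parameter names are hypothetical, and it only answers whether, at time
<code>now</code>, a started run has exceeded the Grace Time without reporting
success.</p>

```python
from datetime import datetime, timedelta
from typing import Optional


def start_without_success(
    last_start: Optional[datetime],
    last_success: Optional[datetime],
    grace: timedelta,
    now: datetime,
) -> bool:
    """True if a "start" signal was not followed by "success" within `grace`."""
    if last_start is None:
        # No run is in progress as far as the monitor knows.
        return False
    if last_success is not None and last_success >= last_start:
        # The most recent run has already reported success.
        return False
    # A run started and has not reported back; compare against Grace Time.
    return now - last_start > grace
```

<p>For example, with a Grace Time of ten minutes, a job that sent "start" at 02:00
and nothing since would be treated as failed from just after 02:10 onward.</p>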
<hr />
<p>An <strong>Integration</strong> is a specific method for delivering monitoring alerts when checks
change states. SITE_NAME supports many different types of integrations, including email,
webhooks, SMS, Slack, and PagerDuty. You can set up multiple integrations.
For each check, you can specify which integrations it should use.</p>
<p>For more information on integrations, see
<a href="configuring_notifications/">Configuring notifications</a>.</p>
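<p>One way to picture the relationship between checks and integrations is the
following sketch. The <code>Check</code> class, the <code>set_status</code> function, and the
notifier callables are all hypothetical. For simplicity the sketch alerts only
on a transition into "down", the one case described above; which other state
changes produce alerts is not modeled here.</p>

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A notifier delivers one alert message through one channel (hypothetical type).
Notifier = Callable[[str], None]


@dataclass
class Check:
    name: str
    # Names of the integrations this particular check should use.
    integrations: List[str] = field(default_factory=list)
    status: str = "new"


def set_status(check: Check, new_status: str, notifiers: Dict[str, Notifier]) -> List[str]:
    """Update the status; on a transition into "down", alert via the check's
    own integrations. Returns the names of the integrations notified."""
    previous = check.status
    check.status = new_status
    if new_status != "down" or previous == "down":
        return []
    notified = []
    for name in check.integrations:
        notifier = notifiers.get(name)
        if notifier is not None:
            notifier(f"{check.name} is down")
            notified.append(name)
    return notified
```

<p>Because each check carries its own list of integration names, two checks in the
same account can alert different people through different channels.</p>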
<hr />
<p><strong>Project</strong>. To keep things organized, you can group checks and integrations in <strong>Projects</strong>.
Your account starts with a single default project, but you can create any number
of additional projects as needed. You can transfer existing checks between projects
while preserving their configuration and ping URL.</p>
<p>Each project has a configurable name, a separate set of API keys, and a separate
project team. The project's team is the set of people you have granted read-only or
read-write access to the project.</p>
<p>For more information on projects, see <a href="projects_teams/">Projects and teams</a>.</p>