<h1>Configuring Checks</h1>
<p>In SITE_NAME, a <strong>Check</strong> represents a single service you want to
monitor. For example, when monitoring cron jobs, you would create a separate check for
each cron job to be monitored. SITE_NAME pricing plans are structured primarily
around how many checks you can have in your account. You can create checks
either in the SITE_NAME web interface or via the <a href="../api/">Management API</a>.</p>
<h2>Name, Tags, Description</h2>
<p>Describe each check using the optional name, tags, and description fields.</p>
<p><img alt="Editing name, tags and description" src="IMG_URL/edit_name.png" /></p>
<ul>
<li><strong>Name</strong>: names are optional, but it is a good idea to set them.
Good naming becomes especially important as you add more checks to the
account. SITE_NAME will display check names in the web interface, in email reports,
and in notifications.</li>
<li><strong>Tags</strong>: a space-separated list of optional labels. Use tags to organize and group
checks within a project. You can tag checks by environment
(<code>prod</code>, <code>staging</code>, <code>dev</code>, etc.), by role (<code>www</code>, <code>db</code>, <code>worker</code>, etc.), or using
any other scheme.</li>
<li><strong>Description</strong>: a free-form text field with any related information for your team
or your future self. Describe the cron job's role, who set it up, what to do in
case of failures, and where to look for additional information.</li>
</ul>
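<p>As a rough illustration, the three fields above could be sent to the Management API
when creating a check from a script. This is a minimal sketch, not a definitive client:
the URL, the API key, the <code>X-Api-Key</code> header, and the JSON field names
(<code>name</code>, <code>tags</code>, <code>desc</code>) are assumptions, so verify
them against the Management API documentation. The request is only built, not sent.</p>

```python
import json
import urllib.request

# Hypothetical values: replace with your own instance URL and API key.
API_URL = "https://example.com/api/v1/checks/"
API_KEY = "your-api-key"


def build_create_check_request(name, tags, description):
    """Build (but do not send) a request that creates a check."""
    payload = {
        "name": name,
        "tags": " ".join(tags),  # tags are a space-separated list
        "desc": description,     # assumed field name for the description
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Api-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )


request = build_create_check_request(
    name="Nightly database backup",
    tags=["prod", "db"],
    description="Dumps the main database. On failure, contact the on-call DBA.",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```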
<h2>Simple Schedules</h2>
<p>SITE_NAME supports two types of schedules: <strong>Simple</strong> and <strong>Cron</strong>. Use Simple
schedules for monitoring processes that you expect to run at relatively regular time
intervals: once an hour, once a day, once a week.</p>
<p><img alt="Editing the period and grace time" src="IMG_URL/edit_simple_schedule.png" /></p>
<p>For simple schedules, you can configure two parameters: Period and Grace Time.</p>
<ul>
<li><strong>Period</strong> is the expected time between pings.</li>
<li><strong>Grace Time</strong> is the additional time to wait before sending an alert when a check
is late. Use this parameter to account for small, expected deviations in job
execution times.</li>
</ul>
<p>Note: if you use the "start" signal to <a href="../measuring_script_run_time/">measure job run times</a>,
then Grace Time also specifies the maximum allowed time gap between "start" and
"success" signals. Whenever SITE_NAME receives a "start" signal, it expects to
receive a subsequent "success" signal within Grace Time. If the success signal does
not arrive within the configured Grace Time, SITE_NAME will mark the check as failed
and send out alerts.</p>
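<p>The interaction between Period and Grace Time can be sketched in a few lines.
This is an illustrative model only, not SITE_NAME's actual implementation: the function
names, the status labels, and the handling of exact boundary instants are assumptions.</p>

```python
from datetime import datetime, timedelta, timezone


def check_status(last_ping, period, grace, now):
    """Classify a simple-schedule check as 'up', 'late', or 'down'."""
    expected = last_ping + period  # when the next ping is due
    alert_at = expected + grace    # when an alert would be sent
    if now <= expected:
        return "up"
    if now <= alert_at:
        return "late"  # overdue, but still within Grace Time
    return "down"


def start_timed_out(started_at, grace, now):
    """True if no "success" signal arrived within Grace Time of "start"."""
    return now > started_at + grace


last_ping = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
period = timedelta(hours=1)
grace = timedelta(minutes=10)

for minutes in (30, 65, 75):
    now = last_ping + timedelta(minutes=minutes)
    print(minutes, check_status(last_ping, period, grace, now))
# prints: 30 up, 65 late, 75 down (one pair per line)
```

<p>With a one-hour Period and a ten-minute Grace Time, a ping that is five minutes
overdue only makes the check late. In this model an alert would go out once the ping
is more than ten minutes overdue.</p>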
<h2>Cron Schedules</h2>
<p>Use "cron" for monitoring processes with more complex schedules. This monitoring mode
ensures that jobs run <strong>at the correct time</strong>, and not just at the correct time intervals.</p>
<p><img alt="Editing cron schedule" src="IMG_URL/edit_cron_schedule.png" /></p>
<p>You will need to specify the Cron Expression, the Server's Time Zone, and the Grace Time.</p>
<ul>
<li><strong>Cron Expression</strong> is the cron expression you specified in the crontab.</li>
<li><strong>Server's Time Zone</strong> is the time zone of your server. The cron daemon typically uses
the system's local time. If the machine is not using the UTC time zone, you need to
specify it here.</li>
<li><strong>Grace Time</strong>, same as for simple schedules, is how long to wait before sending an
alert for a late check.</li>
</ul>
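<p>To see why the Server's Time Zone matters, consider a job with the cron expression
<code>0 9 * * *</code> on a server whose local time is two hours ahead of UTC. The
sketch below is illustrative only: it uses a fixed UTC+2 offset as a stand-in for a
real time zone, and the function name and the 30-minute Grace Time are assumptions.</p>

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+2 offset stands in for the server's time zone.
SERVER_TZ = timezone(timedelta(hours=2))


def alert_deadline_utc(scheduled_local, grace):
    """Return the UTC time after which a missed run would trigger an alert."""
    return (scheduled_local + grace).astimezone(timezone.utc)


# "0 9 * * *" means 09:00 in the server's local time.
scheduled = datetime(2024, 1, 15, 9, 0, tzinfo=SERVER_TZ)
grace = timedelta(minutes=30)

deadline = alert_deadline_utc(scheduled, grace)
print(deadline.isoformat())  # 2024-01-15T07:30:00+00:00

# If the same schedule were wrongly interpreted as UTC, the deadline would
# land two hours later, so a missed run would be reported late.
wrong = alert_deadline_utc(scheduled.replace(tzinfo=timezone.utc), grace)
print(wrong - deadline)  # 2:00:00
```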
<h2>Filtering Rules</h2>
<p>In the "Filtering Rules" dialog, you can control several advanced aspects of
how SITE_NAME handles incoming pings for a particular check.</p>
<p><img alt="Setting filtering rules" src="IMG_URL/filtering_rules.png" /></p>
<ul>
<li><strong>Allowed request methods for HTTP requests</strong>. You can require the ping
requests to use HTTP POST. Use the "Only POST" option if preview bots hit
the ping URLs when you send them in email or post them in chat.</li>
<li><strong>Filter by keywords in the Subject line</strong>. When pinging <a href="../email/">via email</a>,
SITE_NAME can look for specific keywords in the subject line. If the subject line contains any of
the keywords listed in <strong>Success Keywords</strong>, SITE_NAME will treat it as a success
signal. Likewise, if it contains any of the keywords listed in <strong>Failure Keywords</strong>,
SITE_NAME will treat it as an explicit failure signal.
This is useful, for example, if your backup software sends an email after each backup
run with a different subject line depending on success or failure.</li>
<li><strong>Pinging a Paused Check</strong>. Normally, when you ping a paused check, it leaves the
paused state and goes into the "up" state (or the "down" state
in case of <a href="../signaling_failures/">a failure signal</a>).
You can change this behavior by selecting the "Ignore the ping, stay in
the paused state" option. With this option selected, the paused state becomes "sticky":
SITE_NAME will ignore all incoming pings until you explicitly <em>resume</em> the check.</li>
</ul>
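<p>The keyword filter and the "sticky" paused state can be modeled roughly as follows.
This is a sketch under stated assumptions, not SITE_NAME's actual logic: the function
names are hypothetical, matching is a plain case-sensitive substring test, failure
keywords are checked before success keywords, and a subject matching neither list is
reported as <code>"no match"</code>. The text above does not specify any of these details.</p>

```python
def classify_subject(subject, success_keywords, failure_keywords):
    """Classify an email ping by keywords found in its subject line."""
    # Assumption: failure keywords take precedence when both lists match.
    if any(keyword in subject for keyword in failure_keywords):
        return "failure"
    if any(keyword in subject for keyword in success_keywords):
        return "success"
    return "no match"  # assumption: the source does not cover this case


def status_after_ping(current_status, signal, sticky_pause=False):
    """Return the check's status after a ping arrives."""
    if current_status == "paused" and sticky_pause:
        return "paused"  # the ping is ignored until the check is resumed
    return "down" if signal == "failure" else "up"


success = ["SUCCESS", "completed"]
failure = ["FAILED", "error"]

signal = classify_subject("Backup completed on db1", success, failure)
print(signal)                                                  # success
print(classify_subject("Backup FAILED on db1", success, failure))  # failure

print(status_after_ping("paused", signal))                     # up
print(status_after_ping("paused", signal, sticky_pause=True))  # paused
```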