Recovery alerts matter as much as down alerts
Most teams obsess over the down alert and ignore its other half. Something breaks, the page fires, everyone scrambles. The service comes back, and nobody is formally told. People drift off the call assuming it is fine. The incident never really gets closed. It just gets abandoned.
A down alert without a recovery alert is half a conversation. You know the bad news started. You do not know, with any precision, when it stopped.
"Is it back?" should not be a guess
After an incident, the most common question is whether it is actually resolved or just quiet for a minute. Without a recovery signal, someone keeps refreshing a dashboard to be sure. A recovery alert answers it for everyone at once. The service is back, here is when, you can stand down.
Duration is data
A recovery alert that includes how long the incident lasted hands you something useful for free. That duration is the start of your postmortem, the number on your status page, and the figure you check to decide whether the incident breached an SLA. If you only capture the down event, you are reconstructing the duration later from memory.
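To make that concrete, here is a minimal sketch of the arithmetic that a down and recovery timestamp pair makes possible. Everything in it is an assumption for illustration: the `Incident` class, the helper names, the example timestamps, and the 99.9% uptime target over 30 days are hypothetical and do not describe any particular product's payload or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Incident:
    """Hypothetical record built from one down alert and its recovery alert."""
    down_at: datetime
    recovered_at: datetime

    @property
    def duration(self) -> timedelta:
        # The duration is simply recovery time minus down time.
        return self.recovered_at - self.down_at


def format_duration(d: timedelta) -> str:
    """Render a duration for a status page or postmortem header."""
    hours, rem = divmod(int(d.total_seconds()), 3600)
    minutes, seconds = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes}m {seconds}s"
    return f"{minutes}m {seconds}s"


def allowed_downtime(window: timedelta, uptime_target: float) -> timedelta:
    """Downtime budget for a window, e.g. 99.9% over 30 days is about 43.2 minutes."""
    return window * (1 - uptime_target)


def breaches_sla(incidents, window: timedelta, uptime_target: float) -> bool:
    """True if total recorded downtime exceeds the budget for the window."""
    total = sum((i.duration for i in incidents), timedelta())
    return total > allowed_downtime(window, uptime_target)


# Example values, assumed purely for illustration.
incident = Incident(
    down_at=datetime(2024, 3, 5, 14, 2, 10, tzinfo=timezone.utc),
    recovered_at=datetime(2024, 3, 5, 14, 49, 40, tzinfo=timezone.utc),
)
print(format_duration(incident.duration))  # 47m 30s
print(breaches_sla([incident], timedelta(days=30), 0.999))
```

Without the recovery timestamp, `recovered_at` is the field someone has to guess at later, and every number derived from it inherits that guess.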
Close the loop automatically
Vigiles sends both down and recovery notifications, and the recovery comes with the incident duration already attached. Your team knows it is over without checking, your status page updates, and the incident closes with the timing already recorded.
An incident is not done when the bleeding stops. It is done when everyone knows it stopped and how long it lasted. Start free, or see how incident management works.