The Shovel.Company Journal

Alert fatigue and the need for an alert inbox

Constant alerting from monitoring systems leads to a poor developer experience

Rishi Ayyer
Apr 26, 2022

The developer experience of alert management is broken

Monitoring setups, spanning infrastructure, application, performance, data and security systems, generate hundreds of alert notifications daily. Production systems have monitoring and alerting configured to notify devs when certain metrics enter an alarm state. Responding to those alerts is time-critical to keep the customer experience uninterrupted and the business impact low.

Alert notifications are funnelled to Slack or other communication tools to reach teams quickly
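
To make this funnel concrete, here is a minimal sketch of forwarding an alert into a channel, assuming a Slack incoming webhook (the URL and the alert payload shape below are placeholders, not from any particular monitoring tool):

```python
import json
import urllib.request

# Placeholder webhook URL -- substitute a real Slack incoming webhook.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"

def forward_alert_to_slack(alert: dict) -> None:
    """Post a simplified monitoring alert to a Slack channel."""
    text = f"[{alert['severity'].upper()}] {alert['name']}: {alert['summary']}"
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

forward_alert_to_slack({
    "name": "HighCPU",
    "severity": "warning",
    "summary": "CPU above 90% on web-1 for 5 minutes",
})
```

Every monitoring tool ends up wired through something like this, which is exactly how a channel accumulates hundreds of undifferentiated messages a day.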

Alerts with a low signal-to-noise ratio start flooding channels. Over time, mild warnings and informational notifications form the bulk of the alerts received, and spammy channels get muted or ignored.

We all have that muted Slack channel which is the catacomb of buried alerts

The wall of text in these channels is tough to consume and results in a degraded developer experience and excessive fatigue.

Symptoms that you have alert fatigue

Constant alert notifications increase context switching, leading to a higher cognitive load. This is especially true if your channels ping non-stop and a large proportion of the notifications aren't relevant or actionable.

Excessive alert fatigue increases the chances of missing critical alerts that would have been spotted quickly in a noise-free environment. In busy, spam-heavy alert channels, tracing and debugging the right issues takes time.

Once a critical alert is identified, debugging it is another challenge. Alert notifications in Slack channels are not first-class entities: if an alert has been resolved in the past, there is no history or context of who worked on it or how it was mitigated.

Identifying and debugging actionable alerts can also take far too long, driving up the mean time to acknowledge (MTTA) and the mean time to resolve (MTTR).
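
For reference, MTTA is the mean time from an alert firing to someone acknowledging it, and MTTR the mean time to resolution. A minimal sketch of the computation, assuming you record fired/acknowledged/resolved timestamps per alert (the records below are made up; Slack-based flows rarely capture them at all):

```python
from datetime import datetime

# Made-up alert records: (fired_at, acknowledged_at, resolved_at).
alerts = [
    (datetime(2022, 4, 1, 9, 0),  datetime(2022, 4, 1, 9, 20), datetime(2022, 4, 1, 10, 30)),
    (datetime(2022, 4, 2, 14, 0), datetime(2022, 4, 2, 15, 5), datetime(2022, 4, 2, 16, 0)),
]

def mean_minutes(deltas):
    """Average a list of timedeltas, expressed in minutes."""
    return sum(d.total_seconds() for d in deltas) / len(deltas) / 60

mtta = mean_minutes([ack - fired for fired, ack, _ in alerts])
mttr = mean_minutes([resolved - fired for fired, _, resolved in alerts])
print(f"MTTA: {mtta:.0f} min, MTTR: {mttr:.0f} min")  # MTTA: 42 min, MTTR: 105 min
```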

Manifestation of alert fatigue in developer teams

Teams don't need to make many mistakes to end up fatigued by alerts. It is quite common to slide slowly from a manageable situation into a state of constant fatigue. Typically, teams move through four stages of monitoring maturity:

Good alerts

As engineering systems evolve, certain incidents and issues cause downtime. Root-cause analysis of those incidents leads to monitoring and alerting being set up to warn the team if the incident repeats. Over time, more alerts are added after each incident, and standard alerts are rolled out across components and services.

Bad notifications

Developers receive more notifications as more alerts are added. Noise creeps in and the quality of alerts degrades over time: notifications repeat, and different systems may send concurrent alerts when something goes down.

A large share of alerts is non-actionable, and the resulting notifications distract developers. Alert notifications start being ignored to avoid the distraction.
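
A first line of defence is deduplicating repeats before they reach the channel. A minimal sketch, assuming each alert carries a name and a source system, and that a repeat inside a suppression window can safely be dropped (the field names and window length are illustrative):

```python
import time

SUPPRESSION_WINDOW_SECONDS = 15 * 60  # drop repeats within 15 minutes
_last_seen = {}  # fingerprint -> timestamp of last notification

def should_notify(alert: dict) -> bool:
    """Return True only for the first occurrence of an alert within the window."""
    fingerprint = (alert["name"], alert["source"])
    now = time.time()
    last = _last_seen.get(fingerprint)
    _last_seen[fingerprint] = now
    return last is None or now - last > SUPPRESSION_WINDOW_SECONDS

# The same alert from the same system notifies only once per window:
assert should_notify({"name": "HighCPU", "source": "prometheus"})
assert not should_notify({"name": "HighCPU", "source": "prometheus"})
```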

Ignored notifications recur more often and kickstart the downward spiral of noisy notifications

Broken interfaces

Alert fatigue is amplified by the current notification flow. Devs receive alerts in Slack or other communication channels, where alert information is fragmented and buried under the volume of other notifications.

Current interfaces do not empower devs to resolve alerts quickly. They are unsuited to tracing or collaboration because they lack important alert context, and they offer no task management or in-place way to modify alerts based on observed patterns.

As the volume of alert notifications grows, these channels compound the noise and make it nearly impossible to make sense of the current state of monitoring.

Missing feedback loop

Alerting is an iterative process, not a one-time set-up-and-consume flow.

When creating an alert, it is tough to forecast the usefulness, accuracy and volume of the notifications it will generate. As systems evolve and their entropy increases, constant tweaks are needed to keep alerts relevant and actionable.
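
One way to build that feedback loop is to track, per alert rule, what fraction of notifications anyone actually acted on, and flag the rules that have decayed into noise. A minimal sketch under that assumption (the counts and threshold below are illustrative):

```python
# Illustrative per-rule counts of notifications sent vs. acted upon.
rule_stats = {
    "HighCPU":      {"sent": 120, "acted_on": 90},
    "DiskSpaceLow": {"sent": 300, "acted_on": 12},
    "QueueBacklog": {"sent": 45,  "acted_on": 40},
}

ACTIONABILITY_THRESHOLD = 0.2  # below this, the rule is mostly noise

for rule, stats in rule_stats.items():
    rate = stats["acted_on"] / stats["sent"]
    if rate < ACTIONABILITY_THRESHOLD:
        print(f"{rule}: only {rate:.0%} of notifications acted on -- tune or retire it")
```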

Without a feedback loop, ignored alerts keep coming back, dealing the final blow of fatigue.


We need a new approach for breezing through alert notifications and keeping them actionable. We deserve a better inbox to enhance the alert management experience.

If you or your team are currently working through noisy alert channels and are looking for a smarter alternative, reach out to us and get a sneak peek of what we're working on!
