Talking head
DevOpsDays Philadelphia 2018

This presentation, by Mauricio Linhares, is licensed under a Creative Commons Attribution-ShareAlike 3.0 license.

In this presentation we’ll learn which metrics matter most in our systems (upper and lower bounds, SLAs/SLOs), what dashboards are for, why different consumers need different dashboards, and why dashboards are for gathering more information about an outage rather than for discovering that one is happening. And then, sadly, alerting: what to think about before adding a new alert (Can we automate the response? Is it really actionable? Do we have expectations for when it will trigger?) and how to avoid alert burnout. The main goal is to help teams and managers make sense of their data by collecting meaningful information, presenting it in a way that is useful to everyone involved, and not drowning teams in noise.
