How SRE Teams Use Service Level Objectives to Drive Reliability

In this episode of The Site Reliability Podcast, Lucas and Luna dive into the practical use of Service Level Objectives (SLOs) in site reliability engineering. They discuss how a major European bank reduced pager fatigue by 40% by shifting from alert-based monitoring to SLO-based error budgets. Lucas explains the difference between SLIs, SLOs, and SLAs, and why measuring user-facing latency is more actionable than measuring CPU utilization. Luna shares a story about a gaming company that used SLOs to prevent a catastrophic launch-day outage. They also cover common pitfalls, such as setting too many SLOs or setting targets that are too tight. The episode includes a brief mention of listener support at buymeacoffee.com/fexingo. Tune in for a focused, actionable conversation on making SLOs work in real production environments.

#SRE #SiteReliabilityEngineering #ServiceLevelObjectives #ErrorBudgets #SLI #SLA #Alerting #IncidentResponse #ProductionEngineering #Uptime #ReliabilityEngineering #Monitoring #Observability #TechPodcast #FexingoBusiness #BusinessPodcast #Technology #DevOps

Keep every episode free: buymeacoffee.com/fexingo
