Advanced · Expert · 5 weeks

Site Reliability Engineering (SRE)

Site Reliability Engineering (SRE) turns service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs) into practical infrastructure skills for running reliable production systems.

Topic 22 of 29

Prerequisites

  • Platform Engineering

Key Concepts & Skills

  • SLI
  • SLO
  • SLA
  • Error Budgets
  • Reliability Engineering
  • Measure and operate SLIs in production-like environments
  • Connect SLOs to infrastructure workflows
  • Troubleshoot failures with repeatable runbooks
  • Document operational tradeoffs and risks
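
To make the relationship between an SLI, an SLO, and an error budget concrete, here is a minimal Python sketch. It assumes a request-based availability SLI (good requests divided by total requests) over a single measurement window. The names `Window`, `sli`, and `error_budget_remaining` are hypothetical and chosen for illustration; they do not come from any particular tool or from the course material.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request counts observed over one measurement window (hypothetical type)."""
    total_requests: int
    good_requests: int


def sli(window: Window) -> float:
    """Availability SLI: the fraction of requests that were good."""
    if window.total_requests == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return window.good_requests / window.total_requests


def error_budget_remaining(window: Window, slo_target: float) -> float:
    """Fraction of the error budget still unspent; negative means overspent."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window.total_requests == 0:
        return 1.0  # nothing served, so nothing spent
    # The SLO allows (1 - target) of all requests to fail.
    allowed_bad = (1.0 - slo_target) * window.total_requests
    actual_bad = window.total_requests - window.good_requests
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Illustrative numbers only: 1,000,000 requests, 500 of them failed.
    month = Window(total_requests=1_000_000, good_requests=999_500)
    print(f"SLI: {sli(month):.4%}")
    print(f"Budget remaining at a 99.9% SLO: {error_budget_remaining(month, 0.999):.1%}")
```

With a 99.9% SLO, roughly 1,000 of 1,000,000 requests may fail, so 500 failures consume about half of the budget. An SLA is typically a looser, contractual target layered on top of the internal SLO.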

Learning Outcomes

  • Explain how Site Reliability Engineering (SRE) impacts reliability and delivery
  • Build or configure a lab around SLIs
  • Identify common failure modes and mitigation strategies

Resources

Official Docs

Open Source Projects

Practice Exercises

Project Task

Run a Site Reliability Engineering (SRE) lab in a local or cloud sandbox.

Quiz