Self-Care for Systems at Any Scale

Lightning talk during DevOps Days LA 2019 at the Southern California Linux Expo 17x in Pasadena.

https://www.socallinuxexpo.org/scale/17x/presentations/self-care-systems-any-scale

Video (unedited from stream - starts at 3:35:28)

youtube link

Talk Abstract

Maintenance is not exciting or revolutionary, but it remains a critical component of your systems infrastructure. This talk will review some approaches for maintaining healthy, happy, and well-tuned systems, whether you are caring for pets, cattle, or a mix of both.

Modern infrastructure has evolved from on-premises hardware and data centers to hybrid environments, cloud infrastructure, containers, servers, serverless, and more, often with various SaaS products thrown into the mix for fun and profit. Software solutions are deployed to production infrastructure before they have even hit an official 1.0 release. These rapid cycles of change are exciting, but moving fast isn’t quite as fun when you are the one fixing a broken system with little documentation after getting paged (by text, call, or notification) at 2:00 AM. Awareness of the health and wellness of your systems throughout the software and infrastructure lifecycle will enable your development team to keep up a fast pace while keeping production systems online and customers happy. Whether you are a team of one responsible for architecture, deployment, and maintenance, or part of a large organization with dedicated operations, support, and SRE teams, maintenance is crucial to ensuring that your systems stay up and running while you are sound asleep.

Slides