Ask most SREs how many incidents they’d have to respond to in a perfect world, and their answer would probably be ‘zero.’ After all, making software and infrastructure so reliable that incidents never occur is the dream that SREs are theoretically chasing. Reducing the number of actual incidents as much as possible is a noble […]
How to Adopt an SRE Practice (When You’re not Google)
Site reliability engineering (SRE) isn’t a new term or practice. The practice of applying software engineering skills and principles to operations problems and tasks happened even before site reliability engineer was a defined job title. But organizing a proactive approach to building and maintaining software drives long-term success in improving operational efficiency, data-driven roadmap planning […]
The Pros and Cons of Embedded SREs
To embed or not to embed: That is the question. At least, that’s one of the questions that companies have to answer as they decide how to implement site reliability engineering. They can either embed SREs into existing teams, or they can build a new, separate team. Both approaches have their pros and cons. The […]
The Evolution of Incident Management
Have you ever thought about the history of incident management? If you’re an SRE, you might be so caught up in the day-to-day work of managing reliability and responding to incidents that you never take time to step back and reflect on the evolution of your role and your responsibilities. And that’s a shame because […]
Site Reliability Engineering (SRE) Comes of Age in 2022
The site reliability engineer (SRE) role is still gathering steam across organizations. In January 2022, LinkedIn listed SRE as the 21st job with the highest global demand throughout the past five years. That’s pretty high for such a specific tech role. And, looking to the future, it appears the SRE practice will only continue to […]
Defining Availability, Maintainability and Reliability in SRE
In the world of reliability engineering, you’ll frequently encounter the three “-ability” words: Availability, maintainability and reliability. They sound similar and have similar meanings. In fact, these words may seem so similar that it can be tempting to use them interchangeably. That would be a mistake. Availability, maintainability and reliability all have distinct—if related—meanings, and […]
Where Do SREs Go From Here?
Charlene O’Hanlon talks with Leo Vasiliou, director of product marketing at Catchpoint, about the results of a study the company fielded with VMware Tanzu and DevOps Institute of nearly 300 site reliability engineers (SREs). This year’s report underscores the challenges of multi-cloud, calls out the underutilization of AIOps and shows a systemic shift in core […]
4 Ways to Overcome Agile Transformation Obstacles
If an organization desires a successful Agile transformation, the first step is to shift its thinking from ‘doing Agile’ to ‘being agile.’ Unfortunately, a commonly shared belief that Agile is just a software development methodology keeps too many companies stuck in the ‘doing Agile’ mindset. This belief ties back to the principles of Agile thinking […]
SREs: Stop Asking Your Product Managers for SLOs
One of the fundamental premises of site reliability engineering is that you should base your reliability goals—i.e., your service level objectives (SLOs)—on the level of service that keeps your customers happy. The problem is, defining what makes your customers happy requires communication between site reliability engineers (SREs) and product managers (PMs) (aka business stakeholders), and […]
Why It’s Time for Site Reliability Engineering to Shift Left
By adopting a multilevel approach to site reliability engineering and arming your team with the right tools, you can unleash benefits that impact the entire service-delivery continuum. In today’s application-driven economy, the infrastructure supporting business-critical applications has never been more important. In response, many companies are recruiting site reliability engineering (SRE) specialists to help them […]








