incident response - Tagged - Page 5 of 7 - staging-devopsy.kinsta.cloud

SRE’s Guide to Pragmatic Incident Response

September 7, 2021 by Bobby Ross

In my past experience as an SRE, I learned some valuable lessons about how to respond to and learn from incidents. If you want the TL;DR, I’ll summarize them here: Declare and run retros for the small incidents. It’s less stressful, and action items become much more actionable. Decrease the time it takes to analyze an […]

Choosing an Incident Management Platform

August 12, 2021 by Bobby Ross

When you’re feeling the stress and pain of manually managing incidents and incident response, making the decision to find an incident management tool is a no-brainer. But how do you choose the one that will work best for you, your team and your business? You might be asking yourself, “Where do I start? What do […]

Why You Should Embrace Incidents and Ditch MTTR

July 16, 2021 by John Egan

The cliché is that everyone in IT hates incidents, and the natural reaction when assembling incident response metrics is to look for numbers that you can lower over time. Fewer incidents and shorter incident response times must be better, we think. You might already be familiar with the common metrics associated with these goals, including […]

Best Practices for Cloud Incident Response

June 29, 2021 by Gilad David Maayan

Cloud computing is now mainstream, with almost all organizations running at least some resources in the public cloud—whether software-as-a-service (SaaS), platform-as-a-service (PaaS) or infrastructure-as-a-service (IaaS). Security teams have been scrambling to adapt to cloud environments, and with the growing adoption of DevSecOps, they are working together with DevOps teams to secure cloud systems from the […]

Why AIOps Is Critical for Pandemic Business Recovery

June 4, 2021 by Sean McDermott

As businesses worked to stay afloat over the last year, innovation in many areas fell behind. Business leaders struggled to understand the short-term and long-term impact the COVID-19 pandemic would have on their business, while employees worried about losing their jobs and customers scrambled to rearrange budgets. As we near the end and can finally […]

The Single-Sentence Postmortem

April 7, 2021 by John Egan

Do you have four seconds? In the time it takes you to read this sentence you could be done with your incident’s postmortem. “But my postmortem has 8 sections! I have to construct a critical timeline, restate my followup tasks, show the major milestones, evaluate root cause…” Stop right there. First of all, your incident […]

Report: The State of DevOps Automation

April 6, 2021 by Bill Doerrfeld

In the race to accelerate digital transformation initiatives, organizations are encountering more incidents, more downtime, and longer resolution times. In fact, 90.4% of organizations saw an increase in incidents since the pandemic began, according to a recent Transposit report. In a completely digital economy, service downtime takes a higher toll. For ITOps teams, working remotely […]

How to Eliminate Incident Inefficiencies

March 11, 2021 by Anirban Chatterjee

In today’s complex, dynamic IT environments, the proliferation of disparate IT Ops, NOC, DevOps and SRE teams and tools is a given – and usually considered a necessity. This leads to the inevitable truth that when an incident happens, often the biggest challenge is collaborating between these teams to understand what happened and resolve the […]

XDR: The DevOps Transformation of Security Infrastructure

December 16, 2020 by Gilad David Maayan

eXtended detection and response (XDR) is a security technology that unites multiple security systems into one. Organizations are transitioning from traditional systems such as endpoint detection and response (EDR) and security information and event management (SIEM) to XDR, in a move that is analogous to the transition from agile to DevOps work processes. XDR can […]

Leading Effective Incident Response Without Interminable Bridge Calls

December 15, 2020 by Isaac Sacolick

There are easier ways to manage incident response without creating war rooms and packing IT staff onto bridge calls. Your phone vibrates at 11 p.m., and you know that can only mean another major incident with one of the business’ critical systems. You get geared up for the war room, dial into the bridge call […]
