ServiceNow has added to its software-as-a-service (SaaS) portfolio an incident management platform based on the Lightstep observability platform it acquired last year.
Ben Sigelman, general manager of Lightstep at ServiceNow, said Lightstep Incident Response gives DevOps teams access to observability tools that, via a self-service portal, enable them to determine the root cause of any incident much faster. It combines the ability to orchestrate on-call escalation, alert grouping, incident analysis and remediation with a set of collaboration and incident management tools.
That approach will make the Lightstep observability platform much more accessible via a cloud service, noted Sigelman. The Lightstep observability platform itself is based on a time-series database capable of processing one trillion events each day.
Lightstep Incident Response manages an organization’s on-call rotations by synchronizing schedules via a shared calendar, with specific tags that indicate who needs to be looped in based on the nature of the incident and the service affected. Team members are invited to a dedicated channel, set up through prebuilt collaboration integrations, for quick remediation. Over time, teams can create automated processes that self-triage and self-remediate problems as they recur. The platform also integrates with monitoring, observability and collaboration tools such as LogicMonitor, Postman, Slack, Sumo Logic and Zoom.
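To make the tag-based routing idea concrete, here is a minimal sketch of how an on-call schedule with tags could decide who gets looped in. It is an illustration only and not Lightstep Incident Response's actual data model or API: the `Shift` class, the `who_to_page` function, the `service:`/`kind:` tag format and the sample names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Shift:
    """One on-call shift on the shared calendar (hypothetical structure)."""
    engineer: str
    start: datetime
    end: datetime
    tags: frozenset = frozenset()


def who_to_page(shifts, service, kind, at):
    """Return engineers on call at `at` whose tags cover both the service and the incident kind."""
    wanted = {f"service:{service}", f"kind:{kind}"}
    return sorted(
        s.engineer
        for s in shifts
        if s.start <= at < s.end and wanted <= s.tags
    )


# Sample schedule; names, dates and tags are made up for illustration.
shifts = [
    Shift("alice", datetime(2022, 5, 2, 0), datetime(2022, 5, 2, 12),
          frozenset({"service:checkout", "kind:latency"})),
    Shift("bob", datetime(2022, 5, 2, 0), datetime(2022, 5, 2, 12),
          frozenset({"service:search", "kind:latency"})),
    Shift("carol", datetime(2022, 5, 2, 12), datetime(2022, 5, 3, 0),
          frozenset({"service:checkout", "kind:latency"})),
]

print(who_to_page(shifts, "checkout", "latency", datetime(2022, 5, 2, 9)))
```

In this sketch a latency incident on the checkout service at 9 a.m. reaches only the engineer whose shift is active and whose tags match both the service and the incident type.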
Lightstep Incident Response is generally available now in both free and paid versions. Pricing is based on the active services being managed rather than on seat licenses.
Lightstep Incident Response also natively integrates with the Now Platform that ServiceNow created for IT service management (ITSM). That integration will enable organizations to more easily meld ITSM and DevOps workflows, noted Sigelman.
Observability platforms aggregate logs, metrics and traces in a way that makes it possible for DevOps teams to query that data. The goal is to make it easier for DevOps teams to identify anomalies that might disrupt an application environment. Those same tools, however, also allow DevOps teams to identify the root cause of an IT incident much faster. Today many organizations routinely convene “war rooms” that require IT teams to painstakingly identify the root cause of an issue by a process of elimination, which can take days, and sometimes even weeks, to complete.
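As a rough illustration of why queryable trace data shortens root-cause analysis, the sketch below ranks downstream services by how often they report errors inside traces that failed at a given entry point. It is a toy example under stated assumptions, not how Lightstep or any particular platform works: the span records, their field names and the `suspect_services` function are hypothetical.

```python
from collections import Counter

# Hypothetical span records; real platforms store far richer data.
spans = [
    {"trace_id": "t1", "service": "checkout", "duration_ms": 120, "error": False},
    {"trace_id": "t1", "service": "payments", "duration_ms": 95, "error": False},
    {"trace_id": "t2", "service": "checkout", "duration_ms": 2300, "error": True},
    {"trace_id": "t2", "service": "payments", "duration_ms": 2250, "error": True},
    {"trace_id": "t3", "service": "checkout", "duration_ms": 2100, "error": True},
    {"trace_id": "t3", "service": "inventory", "duration_ms": 40, "error": False},
    {"trace_id": "t3", "service": "payments", "duration_ms": 2050, "error": True},
]


def suspect_services(spans, entry_service):
    """Rank downstream services by error count within traces that failed at the entry service."""
    # Traces in which the entry service itself reported an error.
    failing = {
        s["trace_id"]
        for s in spans
        if s["service"] == entry_service and s["error"]
    }
    # Count erroring spans from other services inside those failing traces.
    counts = Counter(
        s["service"]
        for s in spans
        if s["trace_id"] in failing and s["error"] and s["service"] != entry_service
    )
    return counts.most_common()


print(suspect_services(spans, "checkout"))
```

A single query of this kind points at the payments service directly, where a war room would reach the same conclusion only after ruling out the other services one at a time.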
With more applications becoming instrumented, thanks in part to the availability of open source agent software, Sigelman said observability platforms will play a critical role in helping to automate incident management processes. Prior to being acquired by ServiceNow, Lightstep team members played a role in the creation of both the OpenTracing and OpenTelemetry open source projects for collecting traces, metrics and logs. Today, OpenTelemetry is an incubation-level project being managed under the auspices of the Cloud Native Computing Foundation (CNCF).
With the deployment of more microservices-based applications, the percentage of applications that are instrumented will soon increase substantially. It’s not feasible to manage these highly distributed applications without some level of instrumentation. The issue now, of course, is determining to what degree to also instrument many of the monolithic applications that are already running in those same environments.