Site reliability engineering (SRE) has grown from a niche practice invented at Google into a foundation of contemporary enterprise performance worldwide. As organizations adopt microservices, multi-cloud infrastructure and continuous deployment pipelines, the operational surface area has expanded to the point that human personnel cannot monitor and manage it in real time. The effectiveness […]
Grafana Labs Slashes Time to Create Observability Dashboards
At its GrafanaCON 2023 event today, Grafana Labs updated its core platform for visualizing data to make it possible to set up dashboards in minutes. The updates are part of a celebration of the 10th anniversary of a tool that is now widely used by DevOps teams. Ryan McKinley, a distinguished engineer for Grafana Labs, […]
What the Convergence of Observability and Security Means for Devs
There is a scene in the movie Apollo 13 when the mission control flight director asks why the carbon dioxide scrubber in the command module was a different shape than the one used in the lunar module (and therefore incompatible). The engineer simply replied, “This just isn’t a contingency we’ve even remotely looked at.” A […]
Datadog Dives Into Universal Service Monitoring
Datadog, Inc. today made generally available a Universal Service Monitoring service that takes advantage of the extended Berkeley Packet Filter (eBPF) technology in the Linux kernel to automatically detect all the services that make up an application environment without changes to the code used to construct them. Yrieix Garnier, vice president of product at […]
Switching From Fluentd to Vector Log Aggregation Tool
Log files are extremely important to the data analysis process as they contain essential information about usage patterns, activities and operations within an operating system, application, server or device. This data is relevant to a number of use cases across an organization, including resource management, application troubleshooting, regulatory compliance, SIEM, business analytics and […]
Fastly Adds Observability Tools for Edge Computing Environments
Fastly, Inc. this week added a set of observability tools to its portfolio that are optimized for edge computing platforms connected to its content delivery network (CDN). Laura Thomson, senior vice president of engineering at Fastly, said the company is now providing a set of tools that enable DevOps teams to capture logging data and […]
Cribl Simplifies Telemetry Data Analytics
Cribl today announced general availability of a search capability that makes it simpler to query observability data where it resides without having to first collect and centrally store it. Nick Heudecker, senior director of market strategy at Cribl, said Cribl Search makes it possible to analyze telemetry data at its point of origin or when […]
Mezmo Adds Observability Pipeline to Analyze DevOps Data
At the KubeCon + CloudNativeCon North America conference this week, Mezmo launched an Observability Pipeline platform that promises to make it simpler to manage, enrich and correlate machine data. Previously known as LogDNA, Mezmo is expanding the scope of its reach to add the ability to augment and analyze data on top of its core […]
Of Max and Min: The Non-Interference Prime Directive (for Visibility)
Last issue, I cited CPU utilization as an example of a metric that is often misused to describe/explain/infer system performance and asserted that improved visibility can help to overcome such misuse. In this issue, I will expand on what improved visibility means and present two case studies that illustrate a recent antipattern that I’ve noticed […]
Datadog Extends Reach of Integrated DevOps Platform
At its Dash 2022 conference, Datadog announced today it is extending the reach of its namesake cloud-delivered monitoring and observability platform to address continuous testing, application security and cost management. In addition, Datadog has made available in beta a Data Streams Monitoring tool that makes it simpler to identify upstream issues that are likely to […]










