Datadog at its DASH 2024 conference added a bevy of tools and capabilities to streamline DevSecOps workflows. They include an integration with the open source OpenTelemetry agent software developed under the auspices of the Cloud Native Computing Foundation (CNCF), and Datadog On-Call, a tool for optimizing incident management workflows in a way that maintains context with observability data already collected.
In addition, Datadog has added support for agentless scanning, along with tools for discovering vulnerabilities and sensitive data. A Log Workspaces collaboration tool enables analysts and engineers to use natural language interfaces to invoke a generative artificial intelligence (AI) agent that makes it easier to associate logs and other datasets with specific applications, while a Live Debugger tool enables developers to use live production data to better troubleshoot applications.
Finally, the company added a Kubernetes Autoscaling capability to provide IT teams with more control over how cloud-native applications dynamically scale up and down.
Hugo Kaczmarek, director of product for Datadog, said that as observability continues to evolve, it will become easier for IT teams to trace the root cause of issues down to specific lines of code. It may never be possible to prevent every error, but the mean time to remediation will continue to decline rapidly as further advances in AI are made, for example through visualization tools that replay the execution flow of code. IT teams will also be able to leverage generative AI to reproduce issues using production data.
Overall, Datadog is continuing to expand the reach of its observability platform in a way that promises to enable organizations to reduce the number of tools they would otherwise need to acquire, manage and integrate.
At the same time, integrations with open source tools such as OpenTelemetry will make it less expensive to instrument IT environments. Datadog is also making it simpler to centrally manage all the instances of OpenTelemetry that might ultimately be deployed.
The extensions to the Datadog platform, mostly available in beta, are being made at a time when many organizations are starting to embrace platform engineering as a methodology for managing DevOps workflows at scale. The challenge many of them will initially face is determining exactly how many platforms may be required to unify the management of those workflows.
It’s too early to say how platform engineering will evolve, but with the rise of AI tools that generate code, observability is increasingly going to become a requirement. The issue is that many DevOps teams don’t always know which queries to launch to surface issues. With the aid of AI agents, however, the platform itself will not only suggest queries but also run them automatically as needed. The goal is to augment existing DevOps engineering teams so they can manage applications at much higher levels of scale without necessarily increasing headcount.
In the meantime, however, it’s also apparent that many DevOps teams are struggling to manage the workloads already deployed. In that regard, AI tools that promise to reduce existing toil can’t arrive too soon.