Tag: AI observability
Scaling AI the Right Way: Platform Patterns for Performance and Reliability
AI performance breaks long before the model runs. Learn how ingestion speed, elastic training, low-latency inference, observability and automation create reliable, scalable AI systems ...
Three Strategies for Winning the AI Race With DevOps
AI is transforming DevOps. Learn how faster model training, optimized pipelines and smarter GPU infrastructure help teams deliver reliable, scalable AI workflows ...
AI Agent Performance Testing in the DevOps Pipeline: Orchestrating Load, Latency and Token Level Monitoring
Traditional testing misses token and context failures. Discover how to measure, test and scale AI agents reliably in production ...
MCP — A Protocol for SREs
The Model Context Protocol (MCP) standardizes how AI agents access tools, APIs and data. Learn how SREs can leverage MCP to build smarter, automated workflows ...
SRE in the Age of AI: What Reliability Looks Like When Systems Learn
As AI and ML become core production components, SRE is evolving from managing deterministic systems to ensuring the reliability of dynamic, learning systems. New metrics, workflows, guardrails and cross-disciplinary practices are redefining ...
New Relic Enhances Azure Integration with AI-Powered Observability Tools
New Relic Inc. unveiled a suite of intelligent observability integrations with Microsoft Azure on Tuesday to streamline incident response and boost developer productivity as enterprises rush to adopt artificial intelligence (AI) workflows ...
AI-Driven Performance Testing: A New Era for Software Quality
Discover how AI and large language models (LLMs) are revolutionizing performance testing—shifting from reactive load testing to predictive, continuous assurance powered by intelligent agents and automation ...
The Future of Observability: Predictive Root Cause Analysis Using AI
In the past few years, systems have become more complex than ever. Microservices, Kubernetes, cloud environments and distributed application programming interfaces (APIs) have changed how we build and manage software. However, this complexity has also made it harder ...
Observe Adds Two AI Agents to Improve Observability
Observe Inc. introduces the AI SRE Agent and o11y.ai Agent to its observability platform—empowering DevOps teams to automate incident triage, generate OpenTelemetry code, and query application performance using natural language for faster, ...
OpenTelemetry and AI are Unlocking Logs as the Essential Signal for “Why”
Logs reveal the “why” behind failures. Learn how OpenTelemetry and AI transform raw log data into structured, actionable insights for modern observability ...
The Agentic AI-Driven Future of Telemetry
Telemetry is evolving from passive data to AI-grade fuel. Learn how agentic telemetry fuses human and machine context to power self-healing, intelligent systems ...
The Breakneck Future of Codegen: Why AI SWE Must Be Matched with AI SRE
AI codegen is transforming software development, but as speed and complexity increase, so does fragility. AI for site reliability will need to keep pace to avoid system breakdown and engineer burnout ...

