Datadog CEO: Why observability matters in agentic AI

Gaining observability means focusing on what matters — cutting through complexity to see AI systems with clarity. Image courtesy of Edryc James P Binoya.

AI agents are transforming how enterprises build and deliver software, yet a critical piece remains on the sidelines: observability. If software makers do not know what their AI agents are doing, how can they expect customers to trust them?

At the DASH 2025 conference in New York City, Datadog announced new agentic AI monitoring and experimentation capabilities. These aim to help teams track agent behaviour, run structured tests, and maintain central oversight of both in-house and third-party AI agents. In this first of a two-part feature, Datadog Co-Founder and Chief Executive Officer Olivier Pomel outlines the company’s AI strategy, with a focus on embedding observability into AI infrastructure.

AI revolution

During a media conference, Pomel identified four ways the AI boom intersects with Datadog’s business.

First, as organisations deploy GPUs and other AI-specific hardware, a new class of cloud infrastructure is emerging.

Second, Datadog is working with companies building large language models (LLMs).

“All of a sudden, they have applications that are no longer deterministic, and they need to understand what the models are doing — whether they’re doing the right thing, being safe, and being productive,” he said.

Third, more companies are now generating code using AI.

“Back then, companies were using Copilot to speed up coding a little, but they weren’t really generating code with AI. Today, they mostly are, and that raises a whole new set of issues: validating the code, making sure it’s safe and functional, and understanding how humans interact with it,” Pomel explained.

Lastly, AI has enabled Datadog to automate tasks that were previously manual.

“Instead of staying in reactive mode — telling customers there’s an issue, waking them up, and making them fix it — we’re increasingly able to handle much of the fixing ourselves,” he noted.

Culture of trust

To help enterprises get more value from their AI investments, Datadog introduced three new capabilities within its LLM Observability solution. These are designed to monitor agentic systems, run structured LLM experiments, and assess usage patterns and the impact of both custom and third-party agents.

Olivier Pomel, Co-Founder and Chief Executive Officer, Datadog. Image courtesy of Datadog.

The first, AI Agent Monitoring, maps each agent’s decision path — including inputs, tool invocations, calls to other agents, and outputs — in an interactive graph. According to the company, this helps engineers investigate latency spikes, incorrect tool calls, or unexpected behaviours such as infinite loops. These can then be correlated with quality, security, and cost metrics to simplify the debugging of complex, distributed, and non-deterministic agent systems.
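To make the idea concrete, the sketch below shows the kind of span tree such a tool visualises: each step of an agent run (the agent itself, its tool invocations, and their inputs, outputs, and latencies) recorded as nested spans. Every name here (Span, record, run_agent, lookup_weather) is invented for illustration and is not Datadog's SDK; it is a minimal sketch of the underlying data structure, not the product.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field

@dataclass
class Span:
    """One node in an agent's decision path, e.g. an agent, tool, or LLM call."""
    name: str                # e.g. "agent", "tool:lookup_weather"
    input: str = ""
    output: str = ""
    duration_ms: float = 0.0
    children: list["Span"] = field(default_factory=list)

trace: list[Span] = []       # root spans of the decision path
_stack: list[Span] = []      # current nesting, so children attach to parents

@contextmanager
def record(name: str, input: str = ""):
    """Record a step; nested calls become child spans of the current step."""
    span = Span(name=name, input=input)
    (_stack[-1].children if _stack else trace).append(span)
    _stack.append(span)
    start = time.perf_counter()
    try:
        yield span
    finally:
        span.duration_ms = (time.perf_counter() - start) * 1000
        _stack.pop()

def lookup_weather(city: str) -> str:
    with record("tool:lookup_weather", input=city) as span:
        span.output = f"22C and sunny in {city}"   # stand-in for a real API call
        return span.output

def run_agent(question: str) -> str:
    with record("agent", input=question) as span:
        observation = lookup_weather("Singapore")   # the agent invokes a tool
        span.output = f"Answer based on: {observation}"
        return span.output

run_agent("What's the weather in Singapore?")
# Each Span now holds inputs, outputs, latency, and nested calls: enough
# structure to spot a slow tool, a wrong tool choice, or a loop that never ends.
```

Rendered as an interactive graph and correlated with cost and quality metrics, this span-tree structure is what lets engineers pinpoint where a non-deterministic agent run went wrong.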

The second, LLM Experiments, tests and validates the effect of prompt changes, model swaps, or application updates on LLM performance. The tool compares experiments using datasets from real production traces (input/output pairs) or those uploaded by customers. This allows teams to quantify improvements in response accuracy, throughput, and cost, while guarding against regressions.
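A rough illustration of what such an experiment loop does under the hood: score two prompt variants against the same dataset of input/output pairs and compare the results before a change ships. The dataset, prompts, and scoring function below are invented for the example and do not reflect Datadog's API.

```python
dataset = [  # could come from production traces or a customer upload
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def call_llm(prompt: str, text: str) -> str:
    """Stand-in for a real model call; returns a canned answer."""
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(text, "unknown")

variants = {
    "baseline": "Answer briefly: {q}",
    "candidate": "You are terse. Reply with only the answer: {q}",
}

def exact_match(got: str, expected: str) -> bool:
    return got.strip().lower() == expected.strip().lower()

for name, prompt in variants.items():
    hits = sum(
        exact_match(call_llm(prompt.format(q=row["input"]), row["input"]),
                    row["expected"])
        for row in dataset
    )
    print(f"{name}: {hits}/{len(dataset)} correct")
# A regression shows up as the candidate scoring below the baseline,
# before the prompt change ever reaches production.
```

In practice the same loop would also track latency and token cost per variant, which is how teams quantify the trade-offs the article describes.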

The third, AI Agents Console, provides organisations with a central view of both in-house and third-party agent behaviour. It enables teams to track usage, measure impact and ROI, and proactively check for potential security and compliance risks. AI Agents Console launched in preview at DASH 2025.

“Our role is to make sure we integrate with everything our customers use, whether it’s open source or not, and to understand where our differentiation lies and where it doesn’t, both historically and today,” Pomel said.

Domain-specific

Building on its Bits AI generative assistant, Datadog introduced three domain-specific agents at DASH 2025: Bits AI SRE, Bits AI Dev Agent, and Bits AI Security Analyst.

The first, Bits AI SRE, is a 24/7 on-call responder currently in limited availability. For every alert, it performs early triage using telemetry and service context to surface initial investigation findings before responders log in. It assigns owners, aligns stakeholders through real-time incident summaries and updates, and suggests next steps. It also generates a first draft of the post-mortem to save responders time.

The second, Bits AI Dev Agent, currently in preview, detects issues, generates code fixes, and opens pull requests tailored to the organisation’s technology stack. This allows users to quickly review and merge changes directly within their source code management systems.

The third, Bits AI Security Analyst, also in preview, autonomously triages Cloud SIEM signals, investigates potential threats, and provides resolution recommendations, all without human prompting.

“We developed these through constant push and pull. We have one big platform that does everything, and several specialised agents that solve specific problems,” Pomel said. “The reason is that we need to innovate faster and in different directions to cover the different problems.”

Open source

Around 15 years ago, when observability was still referred to as “monitoring,” open source played a vital role, one it continues to hold today. Pomel, one of the original authors of the VLC media player, acknowledged its enduring importance to the developer ecosystem.

“Everything we ship on our customers’ infrastructure — our agents, libraries, collectors, and providers — is open source,” he said.

When it comes to AI, Pomel said, open source takes on even greater significance.

“It’s becoming more important for many companies to have control over what their AI models are doing. The fact that there’s a vibrant ecosystem for open-source models is a good thing, and that’s also why we decided to open source our observability model. We think it benefits the community,” he said.