
The rapid proliferation of AI agents across enterprise environments is reshaping how organisations build and operate software, according to Datadog’s State of AI Engineering 2026 report. Based on telemetry data drawn from thousands of organisations running AI in production, the findings paint a picture of an industry accelerating into complexity—and beginning to encounter the operational limits that come with it.
Two findings stand out. First, the shift toward multi-model strategies is no longer a niche approach; it has become standard practice. Second, AI agents running in production are hitting a hard capacity ceiling, with rate limit errors emerging as the single most common cause of failure.
A multi-model world takes shape
A year ago, OpenAI commanded a 75 per cent share of enterprise LLM usage among Datadog customers. That figure has since fallen to 63 per cent: not because OpenAI lost ground in absolute terms, but because the broader market expanded rapidly around it. The number of Datadog customers using OpenAI more than doubled over the same period, even as Google Gemini and Anthropic Claude gained 20 and 23 percentage points of market share, respectively.
The more telling shift is happening inside organisations themselves. More than 70 per cent now deploy three or more models, and the proportion using more than six models nearly doubled year-on-year. Rather than selecting a single default provider, engineering teams are assembling model portfolios. They are matching lightweight models to extraction and tagging tasks and reserving frontier models for synthesis and reasoning.
This approach offers genuine advantages. Teams can optimise for cost, latency, and output quality at each stage of a workflow. But it introduces significant operational overhead. Coordinating API calls across disparate providers makes it harder to enforce safety and compliance standards consistently and leaves systems more vulnerable when any single provider throttles requests or degrades in performance. The report recommends that teams adopt modular routing mechanisms—such as a gateway service—rather than rely on direct provider API calls scattered across their environments.
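The gateway pattern the report describes can be illustrated with a minimal routing sketch. This is an illustrative example, not code from the report: the task types, provider names, and model names below are all hypothetical placeholders, and a real gateway would wrap actual provider SDK calls behind the resolved route.

```python
from dataclasses import dataclass


@dataclass
class Route:
    provider: str  # hypothetical provider identifier
    model: str     # hypothetical model name


class ModelGateway:
    """Central routing layer so application code never calls
    provider APIs directly, keeping policy in one place."""

    def __init__(self, routes: dict, fallback: Route):
        self.routes = routes      # maps task type -> Route
        self.fallback = fallback  # used for unmapped task types

    def resolve(self, task_type: str) -> Route:
        return self.routes.get(task_type, self.fallback)


# Lightweight models for extraction/tagging; a frontier model for reasoning.
routes = {
    "extraction": Route("provider-a", "small-fast-model"),
    "reasoning":  Route("provider-b", "frontier-model"),
}
gateway = ModelGateway(routes, fallback=Route("provider-a", "general-model"))

print(gateway.resolve("extraction").model)  # routes to the lightweight model
```

Because routing decisions live in one service, a team can swap a degraded provider or tighten compliance rules by editing the route table rather than hunting down API calls scattered across the codebase.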
The compounding nature of this challenge is also reflected in how organisations manage model versions. Teams are quick to test new releases but slow to retire older models already running in production. Each additional model in the fleet increases evaluation burden and operational risk, a form of AI-specific technical debt that accumulates quietly until it becomes difficult to unwind.
AI agents stall at the capacity ceiling
The second major finding concerns how reliably AI agents perform once deployed. Datadog’s analysis of LLM call failures in customer traces reveals that in February 2026, five per cent of all LLM call spans reported an error, and 60 per cent of those errors were caused by exceeded rate limits. The following month, the overall error rate fell to two per cent, but rate limit errors still accounted for nearly a third of failures, totalling approximately 8.4 million incidents in March alone.
The implication is significant. As AI agents take on more complex, multi-step workflows, orchestrating tool calls, chaining model requests and operating with greater autonomy, they are running up against the throughput limits of model providers. Reliability, at scale, is becoming a function not just of code quality or prompt engineering, but of infrastructure capacity.

Datadog’s report recommends a combination of operational patterns, including request budgeting and backpressure systems, alongside prompt-level optimisations to reduce unnecessary token consumption.
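One common way to absorb rate limit errors of this kind is exponential backoff with jitter, which spreads retries out so bursts of requests do not immediately collide with the provider's limit again. The sketch below is a generic illustration of that pattern under stated assumptions, not an implementation from the report; `RateLimitError` is a stand-in for a provider's HTTP 429 response.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for a provider's 'rate limit exceeded' (HTTP 429) error."""


def call_with_backoff(call, max_retries=5, base_delay=0.5):
    """Retry a rate-limited call with exponential backoff and jitter.

    Waits roughly base_delay * 2**attempt between retries, scaled by a
    random factor so concurrent clients do not retry in lockstep.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # budget exhausted; surface the error to the caller
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)
```

A fuller backpressure system would pair this with a request budget (for example, a token bucket per provider) so that retries themselves cannot push total throughput back over the limit.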
“AI is starting to look a lot like the early days of cloud,” said Yanbing Li, Chief Product Officer at Datadog.
The parallel is instructive. Cloud computing unlocked enormous capability but demanded an entirely new discipline of operational management. AI agents appear to be following the same trajectory, and organisations that invest in observability and reliability infrastructure now may find themselves considerably better positioned as the technology continues to mature.
—
Image Credit: Igor Omilaev on Unsplash
The post Report: AI agents face reliability ceiling as organisations embrace multi-model strategies appeared first on e27.
