
At a lunch forum in Singapore hosted by Firebolt, one message came through clearly. As enterprises move from AI pilots to production, the limiting factor is no longer models or algorithms, but the data infrastructure beneath them. AI workloads are fundamentally changing how data is accessed, queried, and consumed, exposing weaknesses in architectures built for human-driven analytics rather than machine-driven demand.
Firebolt was created because its founders anticipated this inflection point, said Sandeep Mathur, Managing Director, APAC, Firebolt, at the forum, titled “Data Infrastructure in the Age of AI.” As Mathur put it, the company’s mission has always centred on “future-proofing your data strategy and explaining why Firebolt needs to exist as a company.”
AI workloads change who queries data
The reason, Mathur argued, is straightforward. AI changes who or what is querying the data. “Instead of users running these queries, we now have agents running queries,” he said. Unlike humans, AI agents do not pause between questions or carefully optimise SQL. They issue many queries at once, often exploring multiple dimensions of the same business question. “They will likely issue 10, 15, or even 30 queries simultaneously, creating immense pressure on existing infrastructure.”
This shift has immediate consequences for compute consumption. A seemingly simple natural language prompt, such as asking how the business is performing today, can trigger a large language model to generate dozens or even hundreds of back-end queries. “As a result, compute requirements increase dramatically,” Mathur said, adding that customers running early AI pilots have already observed this effect.
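The fan-out Mathur describes can be illustrated with a small simulation. The sketch below is hypothetical throughout: the table name, metrics, dimensions, and the FakeWarehouse class are invented for illustration and do not represent Firebolt's API or any real agent framework. It only shows how one business question can expand into many queries that arrive at the same time.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical breakdowns an agent might derive from a single prompt such as
# "How is the business performing today?" (illustrative names only).
DIMENSIONS = ["region", "product", "channel", "customer_segment"]
METRICS = ["revenue", "orders", "refunds", "units", "discounts"]


def plan_queries(dimensions, metrics):
    """Expand one business question into one SQL string per (metric, dimension)."""
    return [
        f"SELECT {d}, SUM({m}) FROM sales WHERE day = CURRENT_DATE GROUP BY {d}"
        for m in metrics
        for d in dimensions
    ]


class FakeWarehouse:
    """Stand-in for a query engine; it only counts overlapping queries."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak_in_flight = 0
        self.total = 0

    def run(self, sql):
        with self._lock:
            self._in_flight += 1
            self.total += 1
            self.peak_in_flight = max(self.peak_in_flight, self._in_flight)
        time.sleep(0.01)  # pretend the query takes some time to execute
        with self._lock:
            self._in_flight -= 1
        return len(sql)  # dummy result


def fan_out(warehouse, queries):
    """Submit every planned query at once, as an agent would."""
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(warehouse.run, queries))


if __name__ == "__main__":
    wh = FakeWarehouse()
    fan_out(wh, plan_queries(DIMENSIONS, METRICS))
    print(f"queries issued: {wh.total}, peak concurrent: {wh.peak_in_flight}")
```

With four dimensions and five metrics, a single prompt already produces 20 queries. A human analyst would typically run these one after another; the simulated agent submits them together.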
Why existing data architectures fall short
For many organisations, the instinctive response has been to optimise more aggressively within their existing technology stacks. Mathur contends that this often treats symptoms rather than root causes. “Legacy infrastructure unfortunately handcuffs innovation,” he observed. “Engineers don’t want to spend time on these problems,” he added. “They want to work on product innovation and ship new features, not worry about all the infrastructure pieces they are being forced to manage today.”
Mathur also emphasised that the challenge is not that data warehouses, lakehouses, or query accelerators are fundamentally flawed. Rather, he said, none were designed for the concurrency and latency profiles that AI workloads demand. Traditional data warehouses excel at batch processing. Lakehouses add flexibility but introduce operational complexity. Query accelerators can help in specific scenarios but struggle at petabyte scale. The result is a fragmented architecture that still fails to meet real-time AI expectations.
Firebolt’s architectural response
“What’s really needed is a query engine optimised for high concurrency and low latency, exactly what AI workloads demand, without the complexity and cost of traditional enterprise systems,” Mathur said. That insight underpins Firebolt’s architecture, which aims to combine fast query acceleration with the scale and governance capabilities expected of an enterprise data warehouse.
That architectural philosophy is reflected in Firebolt’s engineering decisions. Pascal Schulze, Software Engineer, Firebolt, described how performance is approached at a systems level. “Firebolt focuses on low latency and high concurrency,” he said. “As engineers, we work every day to make that vision real.”
Engineering for concurrency and resilience
At the core of Firebolt is a fully distributed, vectorised query execution engine that uses all available resources, whether operating on a single node or across a multi-node cluster. The system is also designed to remain resilient under pressure. “We support disk spilling when memory is insufficient, ensuring queries don’t fail,” Schulze explained.
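Disk spilling in general can be sketched with a textbook external sort. This is not Firebolt's implementation, which the forum did not detail. The memory_limit parameter and the helper names are assumptions made for illustration: when the in-memory buffer fills up, a sorted run is written to a temporary file, and the runs are merged at the end, so the operation completes instead of failing.

```python
import heapq
import pickle
import tempfile


def _spill(sorted_run):
    """Write one sorted run to a temporary file and rewind it for reading."""
    f = tempfile.TemporaryFile()
    for item in sorted_run:
        pickle.dump(item, f)
    f.seek(0)
    return f


def _read_run(f):
    """Stream items back from a spilled run, one at a time."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return


def external_sort(values, memory_limit):
    """Sort values while holding at most memory_limit items in the buffer.

    Returns (sorted_values, number_of_spills).
    """
    runs, buffer = [], []
    for v in values:
        buffer.append(v)
        if len(buffer) >= memory_limit:
            runs.append(_spill(sorted(buffer)))  # buffer full: spill to disk
            buffer = []
    buffer.sort()
    streams = [_read_run(f) for f in runs] + [iter(buffer)]
    try:
        # heapq.merge lazily merges already-sorted inputs.
        result = list(heapq.merge(*streams))
    finally:
        for f in runs:
            f.close()
    return result, len(runs)
```

The final list is materialised here only to keep the example short. A real engine would stream the merged output rather than hold it in memory.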
Schulze said Firebolt’s optimisation philosophy is guided by a simple principle: “The fastest operation is no operation at all.” Rather than relying purely on raw compute, the system reduces work through early pruning, indexing, and subquery caching. “We aggressively prune data early using indexing and subquery caching,” he said, noting that unnecessary interaction with cloud object storage, often a hidden source of latency, is avoided wherever possible.
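One common way to skip work early is to keep minimum and maximum values for each block of data and ignore blocks that cannot match a filter. The sketch below shows that generic technique, sometimes called a zone map. It is an assumption that this resembles what Firebolt does; the Block class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    min_key: int
    max_key: int
    rows: list  # list of (key, value) tuples


def build_blocks(rows, block_size):
    """Sort rows by key and cut them into blocks that record their key range."""
    rows = sorted(rows)
    blocks = []
    for i in range(0, len(rows), block_size):
        chunk = rows[i:i + block_size]
        blocks.append(Block(chunk[0][0], chunk[-1][0], chunk))
    return blocks


def scan_range(blocks, lo, hi):
    """Return (matching rows, blocks actually read) for lo <= key <= hi."""
    matches, blocks_read = [], 0
    for b in blocks:
        if b.max_key < lo or b.min_key > hi:
            continue  # pruned: the block's key range cannot overlap the filter
        blocks_read += 1
        matches.extend(r for r in b.rows if lo <= r[0] <= hi)
    return matches, blocks_read


if __name__ == "__main__":
    blocks = build_blocks([(k, k * 10) for k in range(100)], block_size=10)
    rows, read = scan_range(blocks, 25, 34)
    print(f"read {read} of {len(blocks)} blocks for {len(rows)} rows")
```

In this toy data set, a filter on keys 25 to 34 touches two of the ten blocks. The other eight are never read, which is the sense in which the fastest operation is no operation.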
The performance impact can be significant. By reusing intermediate results and hash tables built for join operations, Firebolt’s subquery caching can deliver “5 to 10x performance gains, or even 100 to 1000x when result caching applies.” In production environments, this has translated into an average tenfold reduction in join-processing time across workloads.
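The idea of reusing a hash table across joins can be sketched as follows. This is a simplified illustration, not Firebolt's subquery cache: the JoinCache class, the version argument used for invalidation, and the sample tables are all assumptions. The point is only that the build side of a hash join is constructed once and then shared by later queries that join against the same data.

```python
class JoinCache:
    """Caches hash tables keyed by (table name, data version, join column)."""

    def __init__(self):
        self._tables = {}
        self.builds = 0  # how many times a hash table was actually built

    def hash_table(self, name, version, rows, key):
        cache_key = (name, version, key)
        if cache_key not in self._tables:
            table = {}
            for row in rows:
                table.setdefault(row[key], []).append(row)
            self._tables[cache_key] = table
            self.builds += 1
        return self._tables[cache_key]


def hash_join(cache, probe_rows, build_name, build_version, build_rows, key):
    """Inner join: probe a (possibly cached) hash table built on build_rows."""
    table = cache.hash_table(build_name, build_version, build_rows, key)
    return [{**b, **p} for p in probe_rows for b in table.get(p[key], [])]


if __name__ == "__main__":
    customers = [{"cust_id": 1, "name": "A"}, {"cust_id": 2, "name": "B"}]
    orders = [{"cust_id": 1, "amount": 10}, {"cust_id": 3, "amount": 7}]
    refunds = [{"cust_id": 2, "refund": 4}]
    cache = JoinCache()
    hash_join(cache, orders, "customers", 1, customers, "cust_id")
    hash_join(cache, refunds, "customers", 1, customers, "cust_id")
    print(f"hash tables built: {cache.builds}")
```

Both joins probe the same customers table, so the second skips the build phase entirely. Changing the version value, for example after new data is loaded, forces a rebuild so that stale results are not served.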
Scaling for an agentic AI future
These optimisations extend to modern open table formats such as Iceberg. Whether Firebolt is deployed as a full data warehouse or used as a query accelerator on top of external Iceberg tables, Schulze said “performance remains consistent.”
Scalability, Mathur argued, is not simply about adding more resources. It is about giving customers control over price-performance trade-offs. Firebolt supports flexible scaling across node sizes, engine types, and cluster counts, enabling teams to align infrastructure closely with workload behaviour rather than over-provisioning defensively.
These characteristics become especially important in what Mathur described as an agentic AI future. “In an agentic AI world, queries are no longer written by humans but generated by LLMs that issue concurrent requests and analyse results in milliseconds,” Schulze said. In that environment, predictability, latency, and concurrency become foundational requirements for AI workloads.
Mathur concluded that optimising data infrastructure for AI is ultimately about enabling innovation without locking organisations into an ongoing struggle with cost and complexity. By rethinking query execution, offering fine-grained control, and designing for machine-driven access patterns, Firebolt positions itself as infrastructure built not only for today’s AI workloads, but for those still emerging.
As Mathur summarised it, the goal is to help teams “ship AI features very, very quickly” while retaining control over performance and cost. In a world where AI agents do not pause and data questions never stop, that balance may become a decisive advantage.










