Ed Keisling sees retrieval-augmented generation hitting its limits not because of model quality, but because of how enterprise data, evaluation, and retrieval pipelines are designed. As organisations move beyond proofs of concept, gaps in relevance, governance, cost, and evaluability surface quickly.
In this interview with Frontier Enterprise, Progress Software’s Chief AI Officer examines how RAG behaves in real enterprise environments, and why production constraints reshape how retrieval, reasoning, and model choice are evaluated.
Where does RAG break down for enterprises?
Traditional retrieval-augmented generation (RAG) systems are starting to show their limits, particularly when enterprises try to combine structured and unstructured data, ensure output quality, and perform meaningful data enrichment.
To address these limitations, approaches such as agentic RAG are emerging. Agentic RAG builds on traditional RAG pipelines by adding reasoning capabilities and making retrieval itself more flexible, with the aim of improving relevance, accuracy, and alignment with the user’s intent.
A critical part of this shift is contextual and secure retrieval. Systems need to dynamically determine what information to retrieve based on a user’s role or profile. This places greater emphasis on security and governance, not only across the underlying data sources, but also across how those sources are integrated into the RAG pipeline.
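As an illustration of role-aware retrieval, the sketch below filters candidate chunks by the user's roles before ranking them. It is a minimal sketch, not a description of any specific product: the `Chunk` type, its `allowed_roles` field, and the `retrieve_for_user` function are hypothetical names, and the relevance scores are assumed to come from an upstream retriever.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_roles: frozenset  # hypothetical ACL field: roles permitted to see this chunk
    score: float              # relevance score, assumed to come from an upstream retriever


def retrieve_for_user(candidates, user_roles, top_k=3):
    """Drop chunks the user may not see, then return the top_k by score."""
    roles = frozenset(user_roles)
    visible = [c for c in candidates if c.allowed_roles & roles]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]


# Illustrative data only.
chunks = [
    Chunk("Q3 salary bands", frozenset({"hr"}), 0.92),
    Chunk("Travel expense policy", frozenset({"hr", "employee"}), 0.81),
    Chunk("Office opening hours", frozenset({"employee"}), 0.40),
]
results = retrieve_for_user(chunks, {"employee"})
print([c.text for c in results])
```

Filtering happens before ranking and truncation, so a restricted chunk can never take up a slot in the context passed to the model, however high its relevance score.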
Transparency and measurable quality also remain major challenges for RAG in enterprise settings. Beyond proofs of concept, organisations need systems that can be deployed in production with confidence, offering traceability, consistent behaviour, and clearly defined quality metrics.
Traditional RAG pipelines often struggle to demonstrate that their outputs meet enterprise requirements. Future RAG implementations will need to prioritise evaluability, transparency, and trust, ensuring that responses are grounded in data and reliable enough to support real-world decisions.
How should enterprises rethink data pipelines for RAG at scale?
RAG, particularly when combined with AI agents in what’s often referred to as agentic RAG, is changing how organisations think about their data and the value embedded within it.
An organisation’s internal knowledge remains one of its most valuable assets. Treating it accordingly requires that data is properly handled, indexed, analysed, and transformed into outputs that deliver insight and practical value.
This shift requires a rethinking of the data pipeline itself. AI agents working alongside RAG can enrich enterprise data by classifying content not only at the document level, but also at the paragraph or sentence level. They can also support the construction of knowledge graphs, which improve retrieval accuracy and lead to higher-quality outputs when information is fed back into models.
AI agents can also contribute to data safety and integrity. They can help detect and filter harmful content, flag malicious prompts, and reduce the risk of jailbreak attempts within the RAG pipeline.
Together, agentic RAG and AI agents give enterprises an opportunity to treat internal knowledge as something dynamic rather than static. This can support new revenue opportunities, productivity gains, and a clearer understanding of how knowledge contributes to organisational effectiveness.
What blind spots exist in how enterprises measure AI reliability?
Many enterprises are currently running proofs of value and proofs of concept without clearly defined use cases, measurable goals, or consistent evaluation methods. This presents a real challenge.
A related blind spot is the lack of clear expectations at the start of these initiatives. Without agreed success criteria, it is difficult to assess or interpret the outputs AI systems generate for the organisation.
There are also gaps in how output quality is measured. Many enterprises lack the tooling or evaluation frameworks needed to objectively assess performance, accuracy, and impact, so quality cannot be measured consistently.
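To make "clearly defined quality metrics" concrete, here is a minimal sketch of two checks an evaluation harness might start with: a retrieval hit rate and a crude groundedness proxy. Both function names, the `min_overlap` threshold, and the sample data are assumptions for illustration. The word-overlap proxy in particular is a simplification; production evaluation frameworks typically use stronger methods.

```python
import re


def hit_rate_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of queries whose top-k results contain at least one relevant document."""
    if not retrieved_ids:
        return 0.0
    hits = sum(
        1 for got, want in zip(retrieved_ids, relevant_ids)
        if set(got[:k]) & set(want)
    )
    return hits / len(retrieved_ids)


def _tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounded_fraction(answer_sentences, context, min_overlap=0.6):
    """Crude proxy: share of answer sentences whose words mostly appear in the context."""
    if not answer_sentences:
        return 0.0
    ctx = _tokens(context)
    grounded = 0
    for sentence in answer_sentences:
        toks = _tokens(sentence)
        if toks and len(toks & ctx) / len(toks) >= min_overlap:
            grounded += 1
    return grounded / len(answer_sentences)


# Illustrative data only.
retrieved = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]
relevant = [["d2"], ["d9"]]
print(hit_rate_at_k(retrieved, relevant, k=2))

context = "The refund window is 30 days for all plans."
answer = ["The refund window is 30 days.", "Shipping is free worldwide."]
print(grounded_fraction(answer, context))
```

Tracking even simple numbers like these against a fixed test set gives a proof of concept a baseline that later production changes can be compared with.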
What works in a proof-of-concept environment often does not scale easily to production. The journey from proof of concept to enterprise-grade deployment is a different challenge altogether, requiring scalability, governance, and robustness.
How will agentic AI and RAG shape enterprise workflows?
Organisations are entering a period of significant change in how work is structured and executed, with agentic AI and RAG playing a central role.
Agentic approaches, including AI agents and agentic RAG systems, are expected to influence how organisations make decisions, support customers, and analyse and act on data. The change is expected to reach core business processes rather than remain incremental.
Beyond agentic AI and RAG, enterprises are expected to begin adopting the Model Context Protocol (MCP) and small language models alongside larger models. Over the next 12 to 18 months, organisations are likely to combine these models into reasoning systems that reflect their decision-making processes, customer requirements, and internal knowledge.
While the direction of change is significant, it also introduces challenges as these technologies are incorporated into enterprise environments.
What trade-offs exist between model performance and retrieval cost?
Balancing model performance against cost involves trade-offs not only in retrieval, but also in ingestion and generation. Model choice has a direct impact on both cost and output quality.
In an agentic RAG set-up, large language models are typically used across three stages: ingestion, retrieval, and generation. At ingestion, AI agents enrich and prepare data. Using more powerful or expensive models at this stage increases cost, while smaller models can often achieve sufficient accuracy for enrichment tasks at lower expense.
Similar trade-offs apply at generation time. Once retrieved context is provided to the model, its generative capabilities determine the final output. Given the current capabilities of available models, using top-tier models at every step does not always produce proportional improvements, particularly when weighed against cost and token usage.
Across ingestion, retrieval, and generation, enterprises face trade-offs between output quality, latency, token efficiency, and overall cost. Different combinations of models and retrieval strategies can produce different balances across these dimensions, depending on the use case.
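The cost side of this trade-off can be sketched with simple arithmetic. The example below compares using a large model at every stage with using a small model for ingestion and retrieval and a large model only for generation. All names, token volumes, and per-token prices are hypothetical placeholders, not real vendor pricing, and the sketch says nothing about the quality difference between the two configurations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, not real vendor pricing


def pipeline_cost(tokens_per_stage, tier_per_stage):
    """Sum the token cost over every stage, given the model tier chosen for each."""
    return sum(
        tokens / 1000 * tier_per_stage[stage].usd_per_1k_tokens
        for stage, tokens in tokens_per_stage.items()
    )


small = ModelTier("small-model", 0.001)
large = ModelTier("large-model", 0.02)

# Assumed token volumes for one batch of work, for illustration only.
tokens = {"ingestion": 200_000, "retrieval": 5_000, "generation": 20_000}

all_large = pipeline_cost(tokens, {stage: large for stage in tokens})
mixed = pipeline_cost(
    tokens,
    {"ingestion": small, "retrieval": small, "generation": large},
)
print(f"all large: ${all_large:.3f}, mixed: ${mixed:.3f}")
```

Under these assumed numbers, ingestion dominates token volume, so the model chosen for enrichment drives most of the bill. Whether the cheaper configuration is acceptable still depends on measuring output quality for the use case.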
Taken together, agentic RAG pipelines highlight that performance gains are not linear with model size or cost. Decisions around model selection and retrieval depth influence efficiency and output quality at each stage of the pipeline.