AI, and particularly generative AI, may have been at the forefront of global attention these past few years, but we’re starting to hear growing concerns about an impending AI reckoning as many enterprises aren’t seeing the ROI from their AI investments. Gartner calls this the “trough of disillusionment,” a normal phase that all technologies go through. For now, market observers still think AI spending will continue to grow. According to IDC, the top 1,000 companies in Asia will allocate more than half of their IT spending to AI initiatives by 2025.
But if AI is going to pull through this trough of disillusionment, there is one critical area that needs to be addressed: the underlying IT infrastructure, including data storage. Pure Storage’s recent Innovation Race study found that 80% of global CIOs and decision-makers feel their companies need to enhance existing infrastructure to effectively support the increasing demands of AI deployments.
Enterprises of all sizes are increasingly recognising the limitations of their existing storage architectures. Many are locked into legacy systems that lack the performance and reliability to support AI workloads. In another Pure Storage study on the impact of AI on IT infrastructure in Singapore’s public sector, half of the agencies surveyed underestimated the demands of AI deployment on data storage. How can enterprises transform their data environments to better meet the demands of AI?
Understanding AI data storage challenges
To understand the challenges that AI presents from a data storage perspective, we need to look at its foundations. Any machine learning capability requires a training data set, but generative AI needs particularly large and complex data sets encompassing many different types of data. Generative AI also relies on complex models whose underlying algorithms often include a very large number of parameters the system has to learn. The greater the number of features, the size, and the variability of the anticipated output, the larger the data batches and the more training epochs required before inference can begin.
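To make those terms concrete, here is a minimal, illustrative training loop in PyTorch-style Python. The model, data set, and hyperparameters are placeholders chosen for the sake of the example, not a recommendation; the point is simply that the data read before inference scales with data set size, batch count, and the number of epochs.

```python
# Illustrative only: a minimal training loop showing how batch size and the
# number of epochs determine how much data a model processes before inference.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical training set: 100,000 samples with 512 features each.
features = torch.randn(100_000, 512)
labels = torch.randint(0, 10, (100_000,))
loader = DataLoader(TensorDataset(features, labels), batch_size=1024, shuffle=True)

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
optimiser = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

epochs = 5  # larger, more varied outputs typically call for more passes over the data
for epoch in range(epochs):
    for batch_features, batch_labels in loader:
        optimiser.zero_grad()
        loss = loss_fn(model(batch_features), batch_labels)
        loss.backward()
        optimiser.step()

# Every additional epoch is another full pass over the training set, so the
# total data read scales with (data set size x number of epochs).
```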
Given the correlation between data volumes and the accuracy of AI platforms, organisations investing in AI will want to build extensive data sets to fully capitalise on AI’s potential. Generative AI does this by using neural networks to identify the patterns and structures within existing data and create new, proprietary content. Because data volumes are increasing exponentially, it’s more important than ever for organisations to use the densest, most efficient data storage possible to limit sprawling data centre footprints and the spiralling power and cooling costs that go with them. This raises another growing concern: the environmental implications of massively scaled-up storage requirements.
Putting the right foundations in place
To enhance the prospects of a successful AI implementation, organisations should focus on the following key considerations:
- Accessibility of GPUs – GPU supply chains need to be assessed and factored into any AI project from the outset; without access to GPUs, an AI project is unlikely to succeed. Due to high demand and the resulting scarcity of GPUs on the open market, some organisations planning AI implementations may need to turn to hosting service providers to access this technology.
- Data centre power and space capabilities – AI, with its massive data sets, creates real challenges for already stretched data centres, particularly around power. Today’s AI implementations can demand power densities of 40 to 50 kilowatts per rack – well beyond the capability of many data centres. AI is also changing network and power requirements, calling for much higher fibre density and faster networking than many traditional data centre providers can support. Power- and space-efficient technologies will therefore be crucial for successfully launching AI projects: every watt allocated to storage is a watt that cannot power GPUs in the AI cluster (a rough power-budget calculation follows this list). Flash-based data storage can help address these issues, as it is far more power- and space-efficient than HDD storage and requires less cooling and maintenance.
- Data challenges – Unlike other data-driven projects that can be more selective in their data sourcing, AI projects rely on huge data sets to train models and extract insights that fuel new innovation. This creates significant challenges in understanding how new data affects model outcomes. Repeatability also remains an ongoing issue; a best practice for managing very large training data sets is ‘checkpointing’, which allows models to be rolled back to previous states so teams can better understand the impact of data and parameter changes (a minimal sketch follows this list). Additionally, the ethical and provenance issues of using internet-sourced data to train models, as well as the impact of removing specific data from large language models or retrieval-augmented generation data sets, have not been fully explored or addressed.
- Investing in people – Any organisation embarking on an AI journey will encounter skills shortages. There simply aren’t enough data scientists or other professionals with relevant skills in the worldwide workforce to meet demand, so those with the right skills are hard to find and command premium salaries. This is likely to remain a significant issue for the next 5–10 years. Organisations will therefore need to invest heavily not only in hiring talent, but also in training their existing workforce to build AI skills internally.
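On the power point above, a rough, back-of-the-envelope calculation illustrates the trade-off. All figures here are assumptions for the sake of the example (not measured values or vendor specifications): a 50 kW rack budget at the upper end of the range cited above, and an assumed 10 kW draw per 8-GPU training server.

```python
# Back-of-the-envelope rack power budget. All figures are illustrative
# assumptions, not vendor specifications.
RACK_BUDGET_KW = 50.0   # upper end of the 40-50 kW per rack cited above
GPU_SERVER_KW = 10.0    # assumed draw of one 8-GPU training server
GPUS_PER_SERVER = 8

def gpus_supported(storage_kw: float) -> int:
    """Number of GPUs a rack can power once storage has taken its share."""
    remaining_kw = RACK_BUDGET_KW - storage_kw
    return int(remaining_kw // GPU_SERVER_KW) * GPUS_PER_SERVER

# A denser, lower-power storage array leaves more headroom for GPUs.
print(gpus_supported(storage_kw=2.0))   # e.g. efficient flash array -> 32 GPUs
print(gpus_supported(storage_kw=12.0))  # e.g. power-hungrier array  -> 24 GPUs
```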
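And on checkpointing, this is a minimal, generic sketch in PyTorch-style Python of what the technique involves: saving enough training state that a run can be rolled back to an earlier point and the effect of data or parameter changes examined from a known starting point. The model, file paths, and names are hypothetical placeholders.

```python
# Minimal checkpointing sketch: save enough state to revert a training run
# to an earlier point. The model and paths are illustrative placeholders.
import torch
from torch import nn

model = nn.Linear(512, 10)                      # stand-in for a real model
optimiser = torch.optim.Adam(model.parameters())

def save_checkpoint(epoch: int, path: str) -> None:
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimiser_state": optimiser.state_dict(),
    }, path)

def load_checkpoint(path: str) -> int:
    checkpoint = torch.load(path)
    model.load_state_dict(checkpoint["model_state"])
    optimiser.load_state_dict(checkpoint["optimiser_state"])
    return checkpoint["epoch"]

# During training: checkpoint periodically, e.g. at the end of each epoch.
save_checkpoint(epoch=3, path="checkpoint_epoch3.pt")

# Later: roll back to that state to compare the impact of a data or
# hyperparameter change against a known-good baseline.
resumed_epoch = load_checkpoint("checkpoint_epoch3.pt")
```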
The road ahead
With businesses in key Asia-Pacific markets standing to gain up to SG$4.1 trillion in economic benefits from AI by 2030, there is greater pressure to get the groundwork right. A combination of people, processes, and technology can help organisations create an innovation flywheel that drives continuous growth, strengthens competitive advantage, and positions them at the forefront of the AI revolution.