The GPU wars and implications for the enterprise

Photo by Mukil Menon

The graphics processing unit (GPU) is more than just a PC gaming component. Many of the world’s top supercomputers are powered by GPUs. Enterprises, in particular, benefit from using GPUs in disciplines such as deep learning and machine learning, analytics, computational finance, manufacturing, construction, and business process optimisation. This broadening of GPU use can be attributed to several factors:

  • The exponential growth of data. According to market data firm Statista, the total amount of data created and consumed in 2020 reached 64.2 zettabytes, and within the next five years is expected to grow to over 180 zettabytes.
  • The slowdown of Moore’s Law. The observation that the number of transistors in an integrated circuit doubles about every two years no longer holds as it once did, and processor core performance is now expected to double only every 20 years.
  • The growth of GPU performance. Bill Dally, Chief Scientist and Senior Vice President of Research at Nvidia, has observed that Nvidia’s chips have increased in performance “317 times for an important class of AI calculations.”
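To put the first two figures on a common scale, the short sketch below converts them into implied yearly growth rates. It is a back-of-the-envelope illustration only: it assumes steady compound growth, treats "over 180 zettabytes" as exactly 180 reached five years after 2020, and the function names are made up for this example.

```python
def annual_growth_rate(start, end, years):
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1


def annual_rate_from_doubling(doubling_years):
    """Annual growth rate implied by doubling once every doubling_years."""
    return 2 ** (1 / doubling_years) - 1


# Assumption: 64.2 ZB in 2020 growing to exactly 180 ZB five years later.
data_cagr = annual_growth_rate(64.2, 180.0, 5)

# Doubling every 2 years (classic Moore's Law) vs every 20 years.
moore_rate = annual_rate_from_doubling(2)
core_rate = annual_rate_from_doubling(20)

print(f"Data volume growth:        {data_cagr:.1%} per year")
print(f"Doubling every 2 years:    {moore_rate:.1%} per year")
print(f"Doubling every 20 years:   {core_rate:.1%} per year")
```

Under these assumptions, data volume grows by roughly 23% a year, while a 20-year doubling period corresponds to only about 3.5% a year, compared with about 41% a year for a two-year doubling period.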

A new challenger appears

The GPU space currently has two main contenders: Nvidia and Advanced Micro Devices (AMD). Intel is also considered part of the market because it is the leading supplier of CPUs, many of which come with integrated GPUs. However, Intel has yet to directly participate in the GPU war between the two other graphics chipmakers.

Nvidia emerged in the 1990s as one of the competitors in PC graphics acceleration, alongside ATI and the now-defunct 3dfx. AMD, on the other hand, entered the market in 2006 when it acquired video card maker ATI Technologies. Of the two rivals, Nvidia focuses on graphics solutions, while AMD also competes with Intel in the microprocessor space.

Since AMD’s entry into the GPU market, the game of one-upmanship between Nvidia and AMD has heated up, which has led to better products and typically more options for consumers. Most recently, Nvidia introduced deep learning super sampling (DLSS), a technology that uses deep learning to upscale lower-resolution images while maintaining or even improving frame rates. DLSS is, however, exclusive to Nvidia’s RTX GPUs. In response, AMD unveiled FidelityFX Super Resolution (FSR), its own upscaling technology, at Computex 2021. The notable differences between the two are that FSR is open source (which means it is free for developers to use) and that it supports several competitors’ GPUs as well as AMD’s own.

Today, Nvidia and AMD are the only discrete GPU manufacturers in the market. AMD holds the smaller market share, which is similar to its standing in the CPU market opposite Intel. In the overall GPU market as of Q4 2020 – which covers all PC GPUs, including integrated solutions – Intel held a market share of 69%, Nvidia held 17%, and AMD held 15%. For add-in-board GPUs, Nvidia held 83% against AMD’s 17%. For desktop discrete GPUs (standalone GPUs with their own dedicated memory), Nvidia held 82% and AMD 18%.

Expanding into the enterprise

Nvidia’s larger share of the GPU market has given it more resources, which enable it to diversify beyond PC gaming into areas such as GPU-accelerated computing for enterprises. Its recent enterprise efforts include:

  • Grace, an Arm-based data centre CPU designed for applications such as natural language processing, recommender systems, and AI supercomputing.
  • The DGX SuperPod, a cloud-native, multi-tenant AI supercomputer. It includes Nvidia Base Command, a solution that coordinates AI operations for teams of data scientists and developers located around the globe.
  • Collaborations with pharmaceutical firm AstraZeneca and the University of Florida on AI research projects using transformer neural networks.
  • A new classification of Nvidia Certified Systems, which lets customers using Nvidia AI Enterprise (a suite of AI and analytics software) and VMware’s vSphere 7 (a widely used compute virtualisation platform) run virtualised AI applications on industry-standard servers.
  • The Nvidia A30 and A10 GPUs for enterprise servers. The A30 is designed for mainstream AI and data analytics, while the A10 is for AI-enabled graphics, virtual workstations, and mixed compute and graphics workloads.

For its part, AMD has ventured into the enterprise space with several endeavours such as:

  • The Radeon Pro line, which is designed for workstations that run computer-generated imagery, computer-aided design, and high-performance computing.
  • The enterprise-grade Epyc x86-64 processors, which target the server and embedded system markets. Compared with their desktop-grade counterparts, Epyc processors feature higher core counts, more PCI Express lanes, larger cache memory, and support for larger amounts of RAM.
  • Several generations of the Instinct GPU, including the MI25 in 2017, the MI50 in 2018, and the MI100 in 2020. AMD Instinct is a brand of GPUs designed for deep learning, artificial neural networks, and high-performance computing applications.
  • ROCm, a software development platform for GPU-accelerated computing. Since it is an open platform, developers can use it to code for different environments, even Nvidia GPUs.

The rise of the IPU?

Intel, which leads the data centre market, recently revealed its infrastructure processing unit (IPU) plans – a sideways entry into the enterprise GPU wars. 

An IPU is a programmable networking device designed to enable cloud and communication service providers to reduce overhead and free up performance for CPUs. The intention is for the IPU to make better use of resources and to balance processing and storage. It is designed to “address the complexity and inefficiencies in the modern data center,” according to Guido Appenzeller, CTO, Data Platforms Group at Intel.

While this doesn’t directly address pure data-crunching power, it seeks to offload some of the resource-heavy compute overhead from the CPUs. The IPU has dedicated functionality to accelerate modern applications that are built using a microservice-based architecture in the data centre. Why is this important? Research from Google and Facebook has shown that 22% to 80% of CPU cycles can be consumed by microservices communication overhead.
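As a rough illustration of what those percentages could mean, the sketch below computes how much more CPU time would be left for application work if that overhead were moved off the CPU entirely. This is an idealised upper bound rather than an Intel figure: it assumes the whole overhead is offloaded at no cost, and the function name is invented for this example.

```python
def capacity_gain(overhead_fraction):
    """Factor by which application-available CPU cycles grow if the given
    fraction of cycles (communication overhead) is fully offloaded."""
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in the range [0, 1)")
    # Before offload, only (1 - overhead) of the cycles do application work;
    # after an ideal offload, all of them do.
    return 1 / (1 - overhead_fraction)


# The two ends of the range cited above: 22% and 80% overhead.
for overhead in (0.22, 0.80):
    gain = capacity_gain(overhead)
    print(f"{overhead:.0%} overhead -> {gain:.2f}x application capacity")
```

At the low end of the range the ideal gain is about 1.3 times, and at the high end about 5 times, which is why the size of the overhead matters so much to the case for offloading.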

The first of Intel’s FPGA-based IPU platforms is deployed at “multiple cloud service providers”, and its first ASIC IPU is under test. Intel plans to roll out additional FPGA-based IPU platforms and dedicated ASICs.

The global chip shortage

Just as GPUs have started gaining greater relevance for enterprises, the market is facing a global chip crisis in which the demand for semiconductor chips exceeds the supply. Three causes are attributed to the crisis:

  • COVID-19
    Because of the pandemic, people had to work from home and avoid venturing outside. This led to a higher demand for and sales of consumer devices like computers for work, and gaming consoles for entertainment.
  • The United States-China trade conflict
    The US government’s restrictions on Semiconductor Manufacturing International Corporation (SMIC), China’s largest chip manufacturer, have made it more difficult for the company to sell to companies with American ties. This forced those companies to work with other manufacturers such as Taiwan’s TSMC or South Korea’s Samsung, which were already at full capacity.
  • The Taiwan drought
    This year, Taiwan is facing a severe drought. This makes production difficult for chip manufacturers there, because they rely on ultrapure water to clean their wafers and factories.

Beyond the GPU market, the chip shortage has affected at least 169 industries, including automotive and home appliance companies. It has resulted in inflated prices for gaming consoles and GPUs, among other products. During an investor call in April, Nvidia warned that the GPU shortage will continue as “overall demand remains very strong and continues to exceed supply”, while its supplies “remain quite lean.”

What does the chip crisis mean for the enterprise? It could mean scarcity in certain products such as Chromebooks, PCs, and some high-end laptops and workstations. These devices will remain available, but perhaps not in the same quantities as before, or at the discounted prices commonly offered to companies. Thus, procurement of computers may be affected.

Public clouds will likely be largely unaffected, as providers have considerable headroom and can procure computers and components more easily than the average consumer. However, the shortage may affect custom enterprise solutions, probably not in quantity, but possibly in price or even availability.

The chip shortage has given new meaning to the term “GPU wars”. Planning for GPU purchases has become a necessity these days, whether for consumers or enterprises, as the competition has extended to the customers themselves.