National Biobank of Thailand revs up DNA finds with NVIDIA

The National Biobank of Thailand (NBT) has deployed an NVIDIA DGX A100 system and NVIDIA Clara Parabricks sequencing analysis software to accelerate genomic sequencing as part of the government’s plan to promote genomic medicine in Thailand. 

In 2019, the research institution was entrusted to design and implement the IT infrastructure for a national project called Genomics Thailand (GeTH), aimed at introducing genomic medicine as a common medical service. 

One of GeTH’s flagship projects, GeTH50K, involves extracting individual genetic variations from the whole genome sequencing data of 50,000 Thai volunteers. This work will provide a new collection of variants that represents the Thai population better than those in publicly available databases.

The GeTH50K database harbours variants distributed across the entire human genome, which are extremely useful in population genetics. Rare variants in the database may also have medical importance.

The sequence data for an individual’s genome exceeds 100GB and must be sequentially aligned to a human reference genome to identify the individual’s potential variants. This process generates roughly another 100GB, for a total of over 200GB per sample. The parallel processing power of GPUs dramatically accelerates the entire process.
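To give a sense of scale, the per-sample figures above can be multiplied out across the 50,000-volunteer cohort. The short Python sketch below does only that arithmetic. It assumes exactly 100GB of raw data and 100GB of alignment output per sample (the article says "more than", so the result is a lower bound) and decimal units (1PB = 1,000,000GB). The function and constant names are illustrative and do not describe NBT's actual storage planning.

```python
# Rough lower-bound storage estimate for the GeTH50K cohort.
# Assumption: exactly 100 GB raw + 100 GB alignment output per sample;
# the article states "more than", so real totals would be higher.

RAW_GB_PER_SAMPLE = 100       # raw sequence data per genome
ALIGNED_GB_PER_SAMPLE = 100   # extra data produced by alignment
SAMPLES = 50_000              # volunteers in GeTH50K


def cohort_storage_gb(samples: int, raw_gb: int, aligned_gb: int) -> int:
    """Return total storage in GB for all samples (raw plus aligned)."""
    return samples * (raw_gb + aligned_gb)


total_gb = cohort_storage_gb(SAMPLES, RAW_GB_PER_SAMPLE, ALIGNED_GB_PER_SAMPLE)
total_pb = total_gb / 1_000_000  # decimal petabytes

print(f"At least {total_gb:,} GB, or about {total_pb:.0f} PB, for the full cohort")
```

Under these assumptions the cohort needs at least 10PB of storage before any downstream results are counted.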

The identification of variants, known as variant calling, is a required process in genomic medicine. Accurate and rapid processing of whole genome sequencing (WGS) data makes it possible for patients to be treated with precise and personalised care, improving quality of life by reducing hospital visits and the associated costs.

The NVIDIA DGX A100, the universal system for AI workloads, integrates eight NVIDIA A100 Tensor Core GPUs and gives NBT researchers five petaflops of AI performance. This combination of compute density, performance and flexibility enables NBT to consolidate training, inference, and analytics into a unified, easy-to-deploy AI infrastructure.

Additionally, NBT uses the NVIDIA Clara Parabricks computational pipelines, which support several genomics applications. Built on NVIDIA’s CUDA, HPC, AI and data analytics stacks, Clara Parabricks Pipelines enable researchers to build GPU-accelerated libraries, pipelines, and reference application workflows for primary, secondary and tertiary analysis.

“By pairing NVIDIA DGX A100 with NVIDIA Clara Parabricks, we have been able to reduce our WGS data processing by four months,” said Sissades Tongsima, director of NBT. “Processing time per individual user has also been shortened from more than 30 hours to just one to two hours.”
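
The per-individual figures in the quote imply a speedup that is easy to work out. The sketch below is plain arithmetic on the quoted numbers, taking "more than 30 hours" as exactly 30, so the true factors may be somewhat higher. The function name is illustrative, and the calculation does not attempt to reproduce the four-month figure, which depends on scheduling details the article does not give.

```python
# Speedup implied by the quoted per-individual processing times.
# Assumption: "more than 30 hours" is treated as exactly 30 hours.

HOURS_BEFORE = 30        # processing time per individual before
HOURS_AFTER_BEST = 1     # best case quoted after acceleration
HOURS_AFTER_WORST = 2    # worst case quoted after acceleration


def speedup(before_hours: float, after_hours: float) -> float:
    """Return how many times faster the new processing time is."""
    return before_hours / after_hours


low = speedup(HOURS_BEFORE, HOURS_AFTER_WORST)
high = speedup(HOURS_BEFORE, HOURS_AFTER_BEST)

print(f"Roughly {low:.0f}x to {high:.0f}x faster per individual")
```

On these numbers, each genome is processed roughly 15 to 30 times faster than before.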