Enabling genomics research with cutting-edge IT

This article is sponsored by Lenovo.

The COVID-19 pandemic has put health and medical research in the spotlight, and genomics is one area that has shown great potential to improve the lives of millions around the world. Advances in IT have also lowered the cost of, and barriers to entry for, genomic technologies, allowing more people to participate in the space and develop new solutions.

At a virtual roundtable organised by Jicara Media and hosted by Lenovo, Ananda Bhattacharjee, Head, High Performance Computing and Artificial Intelligence (Business Development/Solutions Architect), engaged with industry experts and practitioners to explore the challenges they were facing and the opportunities that lay ahead for the space.

The cost of genomics

While the first human genome took around 13 years and more than US$1 billion to sequence, advances in technology have brought the price down significantly, and a genome can now be sequenced for less than US$1,000. However, participants shared some of the cost constraints they were still dealing with.

For example, a healthcare practitioner shared that whilst prices had fallen significantly in the research field, the healthcare sector had not seen a similar reduction. As a result, many practitioners were using microarrays as a cheaper alternative to sequencing.

Furthermore, since most of their customers wanted to use more established platforms such as Illumina’s NovaSeq, they were still constrained by the licensing agents and distributors in their area of operations. This left little margin for providers such as themselves.

Data regulations and compliance

The increasing use of genomic sequencing has created a deluge of data, with sequencers able to churn out almost 6 TB of data per day. Mr Bhattacharjee shared that whilst Moore’s Law held that the number of transistors on a circuit would double every 18 months, in genomics the amount of data was doubling almost every 7 months.
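
To put the two doubling rates side by side, the short Python sketch below works through the arithmetic. It uses only the figures quoted above (6 TB per day, 7-month and 18-month doubling periods); the five-year horizon and the assumption that a sequencer runs every day of the year are illustrative choices, not figures from the roundtable.

```python
import math


def growth_factor(months: float, doubling_period_months: float) -> float:
    """Growth multiple after `months` for a quantity that doubles
    once every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


def months_to_grow(factor: float, doubling_period_months: float) -> float:
    """Months needed to grow by `factor` at the given doubling period."""
    return doubling_period_months * math.log2(factor)


# Assumption: a five-year comparison window, chosen for illustration only.
HORIZON_MONTHS = 60

data_growth = growth_factor(HORIZON_MONTHS, 7)         # genomics data
transistor_growth = growth_factor(HORIZON_MONTHS, 18)  # Moore's Law as quoted

# Assumption: one sequencer producing 6 TB every day, 365 days a year.
DAILY_OUTPUT_TB = 6
yearly_output_tb = DAILY_OUTPUT_TB * 365

print(f"Data growth over 5 years:       ~{data_growth:.0f}x")
print(f"Transistor growth over 5 years: ~{transistor_growth:.0f}x")
print(f"One sequencer, one year:        {yearly_output_tb} TB")
print(f"Months for data to grow 1000x:  {months_to_grow(1000, 7):.0f}")
```

Under these assumptions, data volumes grow by a factor in the hundreds over five years while transistor counts grow by roughly ten, which is the gap that storage and compute planning has to absorb.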

This huge amount of data means strong IT infrastructure needs to be set up to not only house the data, but also comply with the data privacy laws and protections wherever the servers are housed. This problem is especially prevalent for companies with global operations, as they need to ensure that their data flows adhere to the regulations in all the jurisdictions they are operating in.

Whilst cloud computing and processing might provide a short-term solution, it is still ideal for companies to develop on-premises storage to better manage the data and ensure compliance across their operating locations, especially when it comes to sensitive data such as genome sequences and medical records.

Creating user-centric applications

The increasing amounts of data have also raised questions about how best to equip front-end users with the tools and systems they need to extract actionable insights. While more data has made predictions and analysis more robust and accurate, it has also made it harder for individuals who are not familiar with genomics to tap into the data for their work, with many people intimidated by the sheer volume of information available to them. This may be one of the reasons why the current uptake of sequencing in hospitals is still relatively low.

As genomics and sequencing enter mainstream medicine, it is necessary to develop more intuitive and efficient interfaces for users to interact with the data.

For example, one idea discussed was an application-as-a-service offering that would let medical practitioners quickly search a database of sequences to find the right course of treatment. The application would include functions to search for particular diseases and to show the appropriate treatment and drugs to administer in each case. As doctors are traditionally trained to treat rather than prevent, this service could provide a front end that helps them become more familiar with genomics and its applications upstream, and could also open the door for genomics to be used in other industries such as insurance.
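
As a rough illustration of the lookup such an application might offer, the Python sketch below indexes records by condition name and returns the associated variants and treatment notes. Everything in it is hypothetical: the `Entry` and `ConditionIndex` names, the field layout, and the placeholder records are assumptions made for this sketch and do not describe any product or clinical data discussed at the roundtable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One hypothetical record linking a condition to variants and notes."""
    condition: str
    variant_ids: tuple
    treatment_notes: tuple


class ConditionIndex:
    """In-memory index searched by a case-insensitive substring match."""

    def __init__(self, entries):
        self._by_condition = {e.condition.lower(): e for e in entries}

    def search(self, query: str) -> list:
        q = query.strip().lower()
        if not q:
            return []
        return [e for name, e in self._by_condition.items() if q in name]


# Placeholder records only; not real clinical data.
index = ConditionIndex([
    Entry("Condition A", ("VARIANT-001",), ("placeholder note 1",)),
    Entry("Condition B", ("VARIANT-002", "VARIANT-003"), ("placeholder note 2",)),
])

for hit in index.search("condition b"):
    print(hit.condition, hit.variant_ids, hit.treatment_notes)
```

A production service would sit on a curated, access-controlled database rather than an in-memory dictionary; the point here is only the shape of the query a practitioner-facing front end would expose.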

Integrating AI applications into genomics

With such large amounts of data, more effective ways of analysing it also need to be found, especially as the scale of research and projects continues to grow exponentially. Many traditional methods of querying the data pose a major challenge from an IT perspective, as they require large amounts of computing power to make sense of the data available.

One such solution would be to use artificial intelligence (AI) and machine learning together with big data to make predictions that are more accurate and cost less. Participants shared that this is especially useful for time-consuming processes such as variant calling or predictive analysis. Using AI-driven neural networks and algorithms could significantly reduce the cost of analysis and help bring genomics into more mainstream applications.

Rounding off the discussion, Mr Bhattacharjee shared that many of the problems raised were ones Lenovo hoped to address through its Genomics Optimisation and Scalability Tool (GOAST), which aims to help democratise high-performance computing and get it into the hands of more people. “The cost of sequencing is going down, and the amount of data is going up,” concluded Mr Bhattacharjee. “This means that our predictions will invariably become much better, and we don’t want people to have to keep waiting to get access to the benefits this technology can bring.”