Petabytes of data – how informatics is transforming precision medicine
Posted: 26 March 2020 | Nikki Withers (Drug Target Review)
Advances in informatics have given researchers the ability to mine petabytes of human genomics data and translate them into biologically relevant information. However, further translating this information into knowledge can prove challenging. Slavé Petrovski, Vice President and Head of Genome Analytics and Bioinformatics for AstraZeneca’s Centre for Genomics Research, spoke to Nikki Withers about how informatics has positively impacted precision medicine and genomics research.
Why is precision medicine so important for drug discovery?
The overarching goal of precision medicine is to transform patients’ lives by personalising their treatment. This can be achieved by identifying the underlying molecular cause or biomarkers of disease in individual patients. By knowing this, we aim to match medicines to those patients who are most likely to benefit from that specific treatment.
If you look at our research pipeline, approximately 90 percent follows a precision medicine approach compared to about 10 percent back in 2009. This includes a broad range of cutting-edge technologies for both wet lab and informatics, tumour tissue diagnostics, molecular tests and point-of-care diagnostics, which are allowing information to be available to the physician at the point of interaction with the patient.
How has informatics transformed genomics research?
Something we have been looking at recently is the use of sophisticated analytical frameworks on top of these data to ask further questions and tease out answers. For example: why does it matter that genetic variation is present in that individual? Does it cause disease? Does it change the way they respond to treatment? At AstraZeneca, we have a cloud-based informatics pipeline workflow, which processes all the genomes from our genomics initiative – it is our ambition to analyse up to two million genomes by 2026. This is optimised to the point where we can now complete the end-to-end analysis of approximately 1,300 sequences in an hour. To put that into context, that is a 10-fold increase in efficiency from 2017 and this is driven by the optimisation of our informatics pipeline in the cloud.
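As a rough back-of-the-envelope reading of the figures quoted above, the sketch below works out what the stated throughput would imply. It assumes the rate of 1,300 sequences per hour is sustained continuously and that the "10-fold" figure refers to the same sequences-per-hour measure; neither assumption is confirmed in the interview, and the function names are illustrative only.

```python
def hours_to_process(n_genomes: int, per_hour: float) -> float:
    """Hours needed to analyse n_genomes at a constant hourly rate."""
    return n_genomes / per_hour


def baseline_rate(current_per_hour: float, fold_increase: float) -> float:
    """Hourly rate before a given fold increase in efficiency."""
    return current_per_hour / fold_increase


TARGET_GENOMES = 2_000_000   # stated ambition for 2026
RATE_PER_HOUR = 1_300        # stated end-to-end throughput

hours = hours_to_process(TARGET_GENOMES, RATE_PER_HOUR)
days = hours / 24
rate_2017 = baseline_rate(RATE_PER_HOUR, 10)  # assumed meaning of "10-fold"

print(f"{hours:,.0f} hours, or about {days:.0f} days of continuous processing")
print(f"Implied 2017 rate: {rate_2017:.0f} sequences per hour")
```

Under these assumptions, two million genomes would take roughly 64 days of uninterrupted processing, against well over a year at the implied 2017 rate.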
How is informatics aiding advances in precision medicine?
Every one of us has approximately three billion bases in our genome; that is three billion data points to study. When you span that across two million individuals, you can appreciate how much data that is, and the reason informatics has become increasingly important. For example, patients in selected clinical trials who have consented to genetic analysis may have their data linked to their clinical outcomes. This allows us to study how variations in their three billion bases correlate with how they respond to or tolerate a treatment, and whether they were the right patient population for that medicine given the underlying cause of disease. By integrating these anonymised genomic and clinical data from the hundreds or thousands of participants in our clinical trial programmes, we are aiming to identify the actual genetic profiles that can predict disease progression and response to treatment.
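To make the scale concrete, the short sketch below multiplies out the two figures mentioned above. The storage estimate is an illustrative assumption, not something stated in the interview: it uses an idealised encoding of two bits per base (enough to distinguish A, C, G and T), whereas real sequencing files also carry read depth, quality scores and metadata and are considerably larger.

```python
BASES_PER_GENOME = 3_000_000_000   # approximately three billion bases
INDIVIDUALS = 2_000_000            # two million genomes

# Total number of base-level data points across the whole cohort.
total_bases = BASES_PER_GENOME * INDIVIDUALS

# Assumption: idealised 2-bit encoding per base, ignoring all overhead.
BITS_PER_BASE = 2
total_bytes = total_bases * BITS_PER_BASE // 8
petabytes = total_bytes / 10**15   # decimal petabytes

print(f"Data points: {total_bases:.1e}")
print(f"Idealised minimum storage: {petabytes} PB")
```

Even this stripped-down lower bound lands in petabyte territory, which is consistent with the volumes described in the interview.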
What challenges does informatics present to researchers?
The main challenge is extracting the maximum amount of biological insight from the vast amount of data we are generating; we must address how we can translate petabytes of genomics data into biologically relevant information. Translating that information into knowledge is the next step in the process, and it is an area we are still working through, using AI and machine learning.
What are your thoughts on collaboration in this area of research?
For this collaboration, it was clear to all the individual partners in this pre-competitive consortium that the costs prohibited us from doing it alone. The obvious conclusion was to work together to generate these data. This paradigm shift from, “This is my silo of data,” to, “Let us build an immense medical research resource that we could all – industry and academia – benefit from,” had to happen, otherwise we would be limited in terms of progress.
What developments do you expect to see in the next five years?
It is very exciting to see how recent progress in informatics and technology enables large genomic studies to be conducted at scale. I could not have imagined analysing the exomes of 300,000 individuals in any bioinformatics environment five years ago. Those capabilities did not exist, partly because we did not have that scale of genomics data, so there was no need to push the boundaries of technology. As we have seen in other fields, often it is the data that instigates the need to build up innovative IT architecture.
Moving to analytics, we now have access to hundreds, thousands and, before we know it, millions of genomes. I am excited to see what we can extract from these; from studying individual variants with large effects on clinical outcomes to looking at combinations of variants to polygenic risk scores – all with the aim of getting the right treatment to the right patient at the right time.
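For readers unfamiliar with the term, a polygenic risk score is conventionally computed as a weighted sum of a person's risk-allele counts across many variants. The sketch below is a minimal illustration of that general idea, not a description of AstraZeneca's methods; the function name, the dosages and the effect weights are all hypothetical values chosen for the example.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele counts (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    if any(d not in (0, 1, 2) for d in dosages):
        raise ValueError("each dosage must be 0, 1 or 2")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical individual: copies of the risk allele at three variants.
example_dosages = [0, 1, 2]
# Hypothetical per-allele effect sizes, for illustration only.
example_weights = [0.10, 0.25, -0.05]

score = polygenic_risk_score(example_dosages, example_weights)
print(f"Polygenic risk score: {score:.2f}")
```

In practice such scores combine thousands to millions of variants, which is one reason cohort size matters so much.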
Related topics
Artificial Intelligence, Biomarkers, Genomics, Informatics, Personalised Medicine, Sequencing
Related organisations
AstraZeneca, UK Biobank
Related people
Slavé Petrovski (AstraZeneca)