UK Biobank has today announced the launch of the world’s most comprehensive study of the proteins circulating in our bodies, which will transform the study of diseases and their treatments. This unparalleled project aspires to measure up to 5,400 proteins in each of 600,000 samples, including those taken from half a million UK Biobank participants and 100,000 second samples taken from these volunteers up to 15 years later. This will allow researchers to explore a first-of-its-kind database, detailing how changes to an individual’s protein levels over mid-to-late life influence disease. The study will begin by analysing the first 300,000 samples, which will include initial samples from 250,000 UK Biobank volunteers and 50,000 second samples taken at follow-up assessments.
Measuring the abundance of thousands of proteins circulating in the blood enables researchers to investigate their potential role in many types of diseases that occur during mid-to-late life. This emerging research field – known as population proteomics – has demonstrated huge potential for diagnostics and therapeutics.
In October 2023, a pilot project released data on nearly 3,000 circulating proteins from 54,000 UK Biobank participants. The pilot was already the world’s largest study of its kind and led to research identifying over 14,000 links between common genetic variants and altered protein levels, over 80% of which were previously unknown.
The research, published in Nature [1], has already been cited over 400 times, laying the foundations for scientists to better understand how and why diseases develop. So far, studies using the data have led to advances in disease prediction [2,3] and in the development of future targeted treatments for breast cancer [4], cardiovascular disease [5], Parkinson’s disease [6], and other brain illnesses [7].
This new study, which aims to increase the size of this unique dataset ten-fold, is being funded by a consortium of 14 leading biopharmaceutical companies, known as the UK Biobank Pharma Proteomics Project.
“For the first time at this scale, researchers will be able to detect the exact causes of diseases by comparing how protein levels change over mid-to-late life in a large group of people. Proteomic data has already paved the way for better cancer, autoimmune and dementia diagnostics, and this truly exciting study of proteins will significantly speed up drug discovery, leading to major improvements in public health and care everywhere.”
Professor Sir Rory Collins, Principal Investigator and Chief Executive of UK Biobank
UK Biobank’s proteomics dataset will allow researchers to:
- Examine proteomic and genetic data from half a million people simultaneously. UK Biobank released the whole genome sequencing of its half a million participants in November 2023. Adding proteomic data will allow researchers to combine these massive datasets, providing a more detailed picture of the biological processes involved in disease progression. This may in turn drive the development of personalised treatments.
- Examine how and why protein levels change over time. Half a million participants provided UK Biobank with a blood sample when they joined and 100,000 of them provided a second sample up to 15 years later. Researchers will be able to see how protein levels have changed over mid-to-late life, enhancing understanding of age-related changes in healthy individuals and shedding light on how diseases develop. This will further accelerate research into diagnostic and prognostic markers.
- Uniquely use proteomic data in combination with imaging data. Nearly 100,000 UK Biobank participants have undergone magnetic resonance imaging (MRI) of their brain, heart and body, providing researchers with detailed scans. Layering these different data types to investigate human health creates a truly extraordinary, detailed understanding of the disease mechanisms.
- Open avenues for developing AI models. Already, machine learning tools can predict future disease many years before diagnosis, with the potential to shape early interventions [8]. The depth and breadth of the proteomic data held within UK Biobank may enable machine learning to accurately subtype diseases, which has the potential to inform what treatments should be given at the point of diagnosis.
Professor Naomi Allen, Chief Scientist of UK Biobank, said:
“Proteomics provides an incredibly detailed snapshot of health. This new frontier of science can unveil how genetics and external factors – like diet, exercise and climate – interact, and will help to pinpoint the key causes of diseases and identify drug targets. It has already led to important scientific discoveries, such as identifying proteins that can help to diagnose disease – including multiple sclerosis [9] – and helping to identify those at higher risk of developing dementia [10] and cancer [11] many years before clinical diagnosis.
“Over 19,000 researchers around the world are using UK Biobank data; adding proteomic data to everything else we hold will enable scientists to make rapid discoveries to help diagnose and treat life-altering diseases.”
It will take about a year to measure the protein levels in 300,000 participant samples. The proteomic data will be made available to UK Biobank-approved researchers [12] in staggered releases from 2026, with the full dataset expected to be added to the UK Biobank Research Analysis Platform by 2027. During this time, additional funding will be sought to analyse samples from all remaining UK Biobank volunteers (an additional 250,000 participants, including second samples from a further 50,000).
Dr Chris Whelan, Director, Neuroscience, Data Science & Digital Health, Johnson & Johnson Innovative Medicine, Pharma Proteomics Project Lead, said:
“UK Biobank’s proteomic dataset has the potential to enable more powerful biomarker discovery, more accurate disease prediction, and more successful drug development. Analysing samples from two time points in the same volunteer will allow us to examine how protein levels change across hundreds of health and disease states over time, at an unprecedentedly large scale.
“This will represent one of the world’s largest ever biopharmaceutical research collaborations, underlining the growing importance of proteomics as a drug discovery tool. I can’t wait to see how the scientific community will explore these data to pinpoint molecular drivers of disease progression, disease subtypes, and aging.”
Before the data are made available to UK Biobank-approved researchers, and in keeping with its Access policy, members of this industry consortium will have a short period of exclusive access (nine months). Any results gleaned will be returned to UK Biobank, further enhancing a ground-breaking health dataset accessible to approved researchers globally.
The protein detection and sequencing will be completed by Regeneron Genetics Center®, using the Olink™ Explore HT proteomics platform from Thermo Fisher Scientific and Ultima UG 100™ sequencers from Ultima Genomics [13], both high throughput technologies enabling large-scale applications.