September 2023
In 2003, the Human Genome Project identified that 99.9% of human DNA is identical between individuals. This enabled researchers to use a reference human genome, which helped build understanding of human diseases and led to the start of precision, or personalised, medicine. However, the 0.1% of variation between individuals holds the explanation of why one person can be more susceptible to a certain disease than another, and why people can respond differently to a specific drug or treatment.
It is therefore important to understand the genetic variation between individuals in order to offer more accurate insights and create greater equity in access to therapies and treatments for everyone. This article considers opportunities to close the gaps in health data diversity to benefit patients everywhere, with a specific focus on the role that health data standardisation can play.
Since the sequencing of the first human genome, the majority of studies of genetic association with disease have been performed in people of European ancestry. As a result, these datasets represent only one group of people and lack representation of other populations. Relying on data from a single population group in research studies can prevent researchers from building correct models and forming complete insights.
This can lead to biased conclusions and may affect data-driven decision-making and processes. Furthermore, studying incomplete data can drive and perpetuate biased beliefs and health inequality, which can result in disparities in healthcare, misdiagnoses and inadequate treatments.
The good news is that experts, organisations, companies and research groups around the world are driving change to champion diverse and inclusive health data for research. So much so that funding bodies are making this a requirement in research.
Biotech companies, such as Gen-t in Brazil and Omica.bio in Mexico, are aiming to sequence the Latin American population, which is a historically underrepresented group in genomic and health studies.
Featured resource: Catch up on our data diversity webinar where global experts Prof Lygia V. Pereira, CEO and Co-Founder, gen-t Science, Victor Angel-Mosti, CEO and Founder of Omica.bio, Prof Matt Silver, Genomics England, and Dr Maria Dunford, CEO of Lifebit, discuss the challenges and opportunities surrounding health data equity.
Closing the gap in data diversity is especially important as artificial intelligence develops. One example is training AI models to detect skin cancer on images of diverse skin colours, rather than only lighter skin tones. This gives a more complete dataset, leading to more accurate and less biased medical insights.
How do we tackle closing the gap in health data diversity?
The first hurdle in closing the gap in health data diversity is to include diverse populations in studies and gather data from these different groups. This means first identifying where data is lacking and then focusing recruitment on the population groups that are underrepresented. For example, Brazil has one of the most diverse populations in the world, but remains massively underrepresented in genomic studies.
A notable challenge with this solution is securing funding from governmental bodies and research organisations to be able to study underrepresented groups.
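As a rough illustration of the first step described above, identifying where data is lacking, the sketch below compares each group's share of a study cohort with its share of a reference population. The function names, group labels and all numbers are hypothetical and made up for illustration; a real analysis would need carefully defined population categories and reference figures.

```python
from typing import Dict, List


def representation_gaps(
    cohort_counts: Dict[str, int], reference_share: Dict[str, float]
) -> Dict[str, float]:
    """Cohort share minus reference share per group (negative = underrepresented)."""
    total = sum(cohort_counts.values())
    if total == 0:
        raise ValueError("cohort is empty")
    return {
        group: cohort_counts.get(group, 0) / total - expected
        for group, expected in reference_share.items()
    }


def underrepresented_groups(gaps: Dict[str, float]) -> List[str]:
    """Groups with a negative gap, most underrepresented first."""
    return sorted((g for g, gap in gaps.items() if gap < 0), key=lambda g: gaps[g])


# Made-up numbers, for illustration only.
cohort = {"group_a": 800, "group_b": 150, "group_c": 50}
reference = {"group_a": 0.50, "group_b": 0.25, "group_c": 0.25}

gaps = representation_gaps(cohort, reference)
priority = underrepresented_groups(gaps)
print(priority)
```

The resulting list could then be used to decide which groups to prioritise in recruitment, subject to the funding constraints mentioned above.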
Standardising health data helps to establish consistent formats and allows for interoperability with other existing data. Having standardised data allows researchers to compare and analyse data from diverse sources and populations.
Data can be standardised to a common data model, for example the Observational Medical Outcomes Partnership (OMOP) Common Data Model. Standardised data supports interoperability because it adopts consistent terminology and protocols for collecting, storing and sharing information.
Having data standardised to a common data model enables researchers to maximise the insights gained from these data and identify trends and disparities in genomic data.
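To make the idea of standardising to a common data model more concrete, here is a minimal sketch that maps records from two sites, each with its own field names and codings, onto a few columns modelled on the OMOP person table. The site records, field mappings and the `to_omop_person` helper are hypothetical. The gender concept IDs (8507 for male, 8532 for female, 0 for no matching concept) reflect our understanding of the OMOP vocabulary and should be verified against the vocabulary release in use.

```python
from typing import Dict, List

# Assumed OMOP gender concept IDs; verify against your vocabulary release.
GENDER_CONCEPTS = {"male": 8507, "m": 8507, "female": 8532, "f": 8532}
NO_MATCHING_CONCEPT = 0


def to_omop_person(record: Dict[str, str], fields: Dict[str, str]) -> Dict[str, int]:
    """Map one site-specific record to a small subset of OMOP-style person columns."""
    raw_gender = record.get(fields["gender"], "").strip().lower()
    return {
        "person_id": int(record[fields["id"]]),
        "gender_concept_id": GENDER_CONCEPTS.get(raw_gender, NO_MATCHING_CONCEPT),
        "year_of_birth": int(record[fields["birth_year"]]),
    }


# Hypothetical source data: each site uses different field names and codings.
site_a_records = [{"patient": "1", "sex": "F", "yob": "1980"}]
site_a_fields = {"id": "patient", "gender": "sex", "birth_year": "yob"}

site_b_records = [{"id": "2", "gender": "Male", "birth_year": "1975"}]
site_b_fields = {"id": "id", "gender": "gender", "birth_year": "birth_year"}

harmonised: List[Dict[str, int]] = (
    [to_omop_person(r, site_a_fields) for r in site_a_records]
    + [to_omop_person(r, site_b_fields) for r in site_b_records]
)
print(harmonised)
```

Once both sources share the same column names and concept IDs, the same query or analysis can run over either of them without site-specific handling.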
One way to help tackle the lack of diverse data is to provide secure data access globally via a federated analysis approach and use of trusted research environments. This approach enables the safe and secure access, linkage and use of distributed data without it needing to be moved.
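The core idea of federated analysis can be sketched in a few lines: each site computes a summary over its own data, and only those summaries are shared and combined. The example below is a simplified illustration, not a description of any particular platform; the function names and measurements are hypothetical, and a real deployment would add access controls, disclosure checks and governance on top.

```python
from typing import List, Tuple


def local_summary(values: List[float]) -> Tuple[int, float]:
    """Runs inside each site: returns only a count and a sum, never row-level data."""
    return len(values), sum(values)


def combine_means(summaries: List[Tuple[int, float]]) -> float:
    """Runs at the coordinator: pools the site summaries into one overall mean."""
    total_count = sum(count for count, _ in summaries)
    if total_count == 0:
        raise ValueError("no data reported by any site")
    total_sum = sum(subtotal for _, subtotal in summaries)
    return total_sum / total_count


# Made-up measurements held separately at two sites.
site_1_values = [1.0, 2.0, 3.0]
site_2_values = [4.0, 5.0]

summaries = [local_summary(site_1_values), local_summary(site_2_values)]
global_mean = combine_means(summaries)
print(global_mean)
```

The pooled mean matches what would be obtained by merging the raw data, but only the aggregate counts and sums leave each site.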
A lack of health data diversity can lead to negative outcomes such as disparities in healthcare, misdiagnoses and inadequate treatments. Closing this gap will benefit patients everywhere and lead to more accurate insights. This can be achieved through inclusion of more diverse data, standardisation of data for interoperability, and maximising secure data access globally.
Look out for the next blog in our series, where we will describe health data transformation.
Author: Chiara Banas, PhD
Contributors: Hannah Gaimster, PhD and Amanda White
Lifebit provides health data standardisation services for clients, including Genomics England, Boehringer Ingelheim, Flatiron Health and more, to help researchers transform data into discoveries.
Lifebit’s services are making health data usable quickly.