Data Transformation
Unlock the value of your large-scale health data
with Lifebit’s Data Transformation Suite
The challenge
Data for health research comes from a wide range of sources
With this diversity comes wide variability in how data is described and stored:
- Formats can vary widely
- Fields are named differently between datasets
- Different medical vocabularies are used between datasets
- Data is often not cleaned
This diversity is also evident across geographies and countries, as local and national healthcare reporting systems frequently use different data models and ontologies.
These differences in the way that data is stored or described create challenges for researchers preparing data for analyses:
- Inefficient and repetitive tasks
Researchers spend repeated time and effort cleaning and harmonising data before analysis.
- Coding and data expertise needed
Data cannot be easily used or cleaned/harmonised by people unfamiliar with the data or who lack the necessary coding expertise.
- Limited analysis potential
Running analysis queries on unstructured data or across data in different file formats is not possible.
The solution
Adopting a common data model
Lifebit’s Data Transformation Suite is used by leading organisations across healthcare and life sciences to accelerate scientific discoveries.
It is harnessing the power of data for Genomics England, the UK government’s flagship initiative to improve the genomic health of the population, and is accelerating research insights for academia, healthcare and pharmaceutical companies across the world.
Transforming data to OMOP
Lifebit is certified by the European Health Data & Evidence Network (EHDEN) in transforming data to the Observational Medical Outcomes Partnership (OMOP) common data model.
This involves harmonising disparate data sources by transforming them into a common format with a standard set of vocabularies, so they can be analysed using a library of standard analytic pipelines.
Lifebit’s Data Transformation Suite uses a set of pipelines that transform raw data to analysis-ready data. These pipelines are automated yet flexible, and built to accommodate new data types over time.
Through the Data Transformation Suite, data is harmonised, mapped to existing standards, annotations and ontologies, and then interlinked during data ingestion to produce a linked data graph. This process increases the interoperability, reusability and overall actionability of the data.
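To make the idea of harmonisation concrete, the sketch below maps records from two sources with different field names and local codes onto one common shape. It is an illustration only, not Lifebit's pipeline: the source names, the `FIELD_MAPS` and `GENDER_CONCEPTS` tables and the `harmonise` function are all hypothetical. The target fields follow the OMOP PERSON table, and the gender concept IDs (8507 for male, 8532 for female, 0 for no matching concept) are the OMOP standard ones.

```python
from dataclasses import dataclass

# Hypothetical per-source mappings: source column name -> common field name.
FIELD_MAPS = {
    "hospital_a": {"patient_id": "person_id", "sex": "gender", "birth_year": "year_of_birth"},
    "registry_b": {"id": "person_id", "gender_code": "gender", "yob": "year_of_birth"},
}

# Hypothetical local codes -> OMOP standard gender concept IDs
# (8507 = male, 8532 = female).
GENDER_CONCEPTS = {
    "m": 8507, "male": 8507, "1": 8507,
    "f": 8532, "female": 8532, "2": 8532,
}


@dataclass(frozen=True)
class Person:
    """A small subset of the OMOP PERSON table."""
    person_id: int
    gender_concept_id: int
    year_of_birth: int


def harmonise(source: str, row: dict) -> Person:
    """Rename source fields, then map local gender codes to standard concepts."""
    field_map = FIELD_MAPS[source]
    renamed = {field_map[key]: value for key, value in row.items() if key in field_map}
    gender_key = str(renamed["gender"]).strip().lower()
    return Person(
        person_id=int(renamed["person_id"]),
        # Concept ID 0 means "no matching concept" in OMOP.
        gender_concept_id=GENDER_CONCEPTS.get(gender_key, 0),
        year_of_birth=int(renamed["year_of_birth"]),
    )


rows = [
    ("hospital_a", {"patient_id": "17", "sex": "F", "birth_year": "1984"}),
    ("registry_b", {"id": 42, "gender_code": 1, "yob": 1990}),
]
people = [harmonise(source, row) for source, row in rows]
```

Once both sources share the same fields and vocabulary, a single query or analysis can run across them without source-specific handling.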
Three key components of Lifebit’s technology
1. Pipeline Composer
2. ETL (Extraction, Transformation, Loading) Pipelines
3. Health Data knowledge base
Increasing data utility
creating analysis-ready data for research
Data is harmonised and ingested into the platform database, where it can be queried for research.
Interested in learning more about
Lifebit’s federated data solution
for genomics research?
At Lifebit, we develop secure federated data analysis solutions for clients including Genomics England, NIHR Cambridge Biomedical Research Centre, Danish National Genome Centre and Boehringer Ingelheim to help researchers turn data into discoveries.