Canada’s National Federated Research Infrastructure: What It Is, Why It Matters, and How It Works

Canada has some of the world’s richest population health data. More than 350,000 volunteer participants across multiple regional cohorts. Decades of longitudinal follow-up. Genomic, environmental, lifestyle, and clinical data linked at a national scale. The research community that built this resource did so because they believed that understanding how genetics, environment, and behaviour interact to cause chronic disease requires breadth—the kind of breadth that only comes from studying a country, not a single institution.

But for years, the infrastructure to actually work with this data at scale has lagged behind the ambition. Researchers who wanted to run analyses across multiple regional cohorts faced a familiar gauntlet: data sharing agreements, cross-jurisdictional governance reviews, the legal and operational overhead of moving sensitive participant data between institutions. Studies that should have taken months took years. Some never happened at all.

That gap is now being addressed. Lifebit and the Canadian Partnership for Tomorrow’s Health (CanPath) have established Canada’s National Federated Research Infrastructure framework—a shared approach that lets researchers work across institutions, at national scale, without centralizing or moving sensitive data. This post explains what that infrastructure is, how it was built, and what it means for Canadian health research going forward.

Why Canada Needed a Different Approach

The core tension in population health research is structural, not technical. National studies need large, diverse datasets. But health data is sensitive. Participants consent to specific uses. Institutions carry legal and ethical obligations. Data residency requirements, particularly in Canada, mean that health information collected from Canadians needs to stay in Canada and under Canadian governance.

The traditional response to this tension was centralization: negotiate a data sharing agreement, transfer everything into one place, and run your analyses on the combined dataset. In theory, it works. In practice, the process is so slow and legally complex that it often takes longer than the research itself. And even when it succeeds, centralization means that data custodians give up a degree of control over how their participants’ data is used—a trade that many institutions, rightly, are reluctant to make.

CanPath had already solved part of this problem. By creating a harmonized national dataset—standardizing how data from regional cohorts is structured, described, and governed—CanPath made it possible, in principle, to study health patterns across Canada. But harmonization alone doesn’t create the computational infrastructure for running large-scale analyses securely, compliantly, and at speed. That piece was missing.

The CanPath–Lifebit framework is built to fill exactly that gap.

What Federated Research Infrastructure Actually Is

Federated research infrastructure is a model where data stays where it is—in the institutions that collected it, under the governance of the people responsible for it—but researchers can run analyses across multiple sites simultaneously.

Instead of pulling participant records into a central database, the analytical tools travel to the data. Each institution’s environment runs the analysis locally. Only the results (aggregates, statistical outputs, model outputs) come back to the researcher. Participant-level data never leaves the institution that holds it. This isn’t a legal workaround. It’s a fundamentally different architecture, designed from the ground up for environments where data can’t and shouldn’t move.

The practical implications are significant. A researcher can submit an analytical query that runs simultaneously across multiple regional cohorts, each governed by different institutional policies, and receive a combined result—without any of those institutions releasing participant-level records. Cross-institutional research that previously required years of legal negotiation becomes operationally feasible. The governance is enforced by design, not by contract.
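To make the pattern concrete, here is a minimal Python sketch of a federated mean and standard deviation. It is an illustration under stated assumptions, not the CanPath–Lifebit implementation: the function names, the `SiteSummary` type, and the three small lists of numbers are all hypothetical. Each site computes only a count, a sum, and a sum of squares, and the researcher combines those aggregates without ever seeing individual values.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate statistics a site releases; no participant-level rows."""
    n: int
    total: float
    total_sq: float


def local_summary(values):
    """Runs inside a site's own environment, next to the data."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def combine(summaries):
    """Runs on the researcher's side, using only the released aggregates."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)  # pooled sample variance
    return n, mean, sqrt(variance)


# Hypothetical, made-up measurements held at three separate sites.
site_a = [1.0, 2.0, 3.0]
site_b = [4.0, 5.0]
site_c = [6.0, 7.0, 8.0, 9.0]

summaries = [local_summary(site) for site in (site_a, site_b, site_c)]
n, mean, sd = combine(summaries)
```

The combined result matches what a pooled analysis of all nine values would give, yet only three numbers per site crossed an institutional boundary. A production system would add disclosure controls on top of this, since even aggregates can be revealing for very small groups.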

Federation as a concept isn’t new. Making it work in practice, at national scale, with the security, compliance, and governance standards that real health research institutions require, is the hard part. That’s what the CanPath–Lifebit framework actually delivers.

What Was Built—and How

The framework combines three elements that need to work together for federated research infrastructure to function in practice.

Secure cloud environments hosted within Canada. The infrastructure runs within Canadian-hosted cloud environments that meet national data residency, privacy, and security requirements. Research organizations can run computationally intensive analyses—genomics, epidemiology, longitudinal cohort studies—on high-performance infrastructure without building or maintaining their own systems, and without data leaving Canadian jurisdiction. A national cloud infrastructure partner provides the underlying computing environments, enabling institutions to scale without separate capital investment in technical systems.

Federated governance, not just federated technology. Each participating organization retains full control over who can access its data and how it is used. The framework doesn’t just federate the technical infrastructure—it federates the governance. This distinction matters. Institutions join because they keep control, not because they surrender it. Regional cohorts within CanPath maintain their own oversight and decision-making authority over their data. They participate in national analyses because they choose to, under governance conditions they set.
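One way to picture governance that each site sets for itself is a policy object the site owns and evaluates locally before anything is released. The sketch below is an assumption for illustration only: the `SitePolicy` fields, the project identifiers, and the minimum cell count rule are hypothetical and do not describe the framework's actual policy model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SitePolicy:
    """Hypothetical access rules set and owned by one data custodian."""
    site: str
    approved_projects: frozenset
    min_cell_count: int  # smallest group size the site is willing to report


def release_count(policy, project_id, true_count):
    """Decide locally what, if anything, leaves the site for this request."""
    if project_id not in policy.approved_projects:
        return {"site": policy.site, "status": "denied", "count": None}
    if true_count < policy.min_cell_count:
        return {"site": policy.site, "status": "suppressed", "count": None}
    return {"site": policy.site, "status": "released", "count": true_count}


# Two made-up cohorts with different rules for the same national query.
policies = [
    SitePolicy("cohort_a", frozenset({"P-001", "P-002"}), min_cell_count=5),
    SitePolicy("cohort_b", frozenset({"P-002"}), min_cell_count=10),
]
local_counts = {"cohort_a": 42, "cohort_b": 7}

responses = [
    release_count(p, "P-002", local_counts[p.site]) for p in policies
]
```

In this example both cohorts have approved project P-002, but the second suppresses its count because it falls below that site's own threshold. The same query produces different outcomes at different sites, and each outcome is decided by the custodian rather than by a central authority.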

A shared foundation for interoperability. The CanPath–Lifebit model uses standardized data dictionaries, harmonized consent language, shared governance frameworks, and compatible technical standards. This means that as new partners join—other population cohorts, biobanks, sequencing initiatives, clinical research networks—they can connect to the national infrastructure without replacing their existing systems. The goal is a growing network, not a fixed one.
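A shared data dictionary is what lets a new partner check its own records against the national standard before connecting. The following Python sketch shows the general idea. The variable names, types, and allowed values are invented for illustration and are not taken from CanPath's actual dictionaries.

```python
# Hypothetical shared data dictionary: variable name -> validation rule.
SHARED_DICTIONARY = {
    "participant_age": {"type": int, "min": 0, "max": 120},
    "sex": {"type": str, "allowed": {"female", "male", "intersex", "unknown"}},
    "smoking_status": {"type": str, "allowed": {"never", "former", "current"}},
}


def validate_record(record, dictionary=SHARED_DICTIONARY):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, rule in dictionary.items():
        if name not in record:
            problems.append(f"missing variable: {name}")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{name}: unexpected value {value!r}")
        if "min" in rule and not rule["min"] <= value <= rule["max"]:
            problems.append(f"{name}: {value} outside {rule['min']}-{rule['max']}")
    # Variables the shared dictionary does not define are flagged, not dropped.
    for name in sorted(record.keys() - dictionary.keys()):
        problems.append(f"unknown variable: {name}")
    return problems


conforming = {"participant_age": 54, "sex": "female", "smoking_status": "never"}
nonconforming = {"participant_age": 130, "sex": "F", "bmi": 22.1}

ok_problems = validate_record(conforming)
bad_problems = validate_record(nonconforming)
```

A partner could run a check like this inside its own environment and fix local coding differences (for example, mapping "F" to "female") without changing the system that stores the data.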

Where It Stands Today

An initial research environment is already live. It includes a synthetic dataset—a realistic but deliberately simulated version of the real data—that allows researchers to test tools, workflows, and queries safely before real participant data is onboarded. This is an intentional and important step. It lets partners validate their analytical approaches, identify gaps, and prepare for collaboration before any real participant information is involved.

Phased onboarding of CanPath’s participating regional cohorts is underway and will continue throughout 2026.

“CanPath is built on a harmonized national dataset that already allows researchers to study large, multimodal health data across regions,” said Dr. Philip Awadalla, CanPath National Scientific Co-Director. “This new framework strengthens that foundation by providing a secure environment for analysis and enabling collaboration not only across Canada, but with international population health initiatives as well.”

The live synthetic environment is particularly significant for researchers planning future studies. It compresses the gap between designing a study and running it. Researchers can validate that their analytical pipeline works—that the data structure matches their assumptions, that the tools are compatible with the environment—without waiting for data access approval. When real data is onboarded, the preparatory work is already done.
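As a rough illustration of what such a dry run might involve, the Python sketch below checks a pipeline's input assumptions against a small synthetic extract. Everything in it is assumed for the example: the CSV content is made up, and the column names are hypothetical rather than taken from the actual synthetic dataset.

```python
import csv
import io

# Made-up synthetic extract; the column names are assumptions for illustration.
SYNTHETIC_CSV = """participant_id,age,bmi,cohort
S0001,54,27.1,cohort_a
S0002,61,,cohort_b
S0003,47,23.4,cohort_a
"""

REQUIRED_COLUMNS = {"participant_id", "age", "bmi"}


def dry_run(csv_text, required=REQUIRED_COLUMNS):
    """Check a pipeline's input assumptions against a synthetic extract."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing_columns = sorted(required - set(reader.fieldnames or []))
    rows = list(reader)
    # Count empty cells for each required column that is actually present.
    empty_cells = {
        col: sum(1 for row in rows if not row[col])
        for col in sorted(required - set(missing_columns))
    }
    return {
        "rows": len(rows),
        "missing_columns": missing_columns,
        "empty_cells": empty_cells,
    }


report = dry_run(SYNTHETIC_CSV)
```

Here the report shows that every required column exists but one BMI value is empty, which is the kind of assumption worth catching in a pipeline before real participant data is involved.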

What This Means for Researchers

For researchers working in population health, epidemiology, genomics, and related fields, the practical implications are worth spelling out clearly.

Larger effective sample sizes. Running analyses across multiple regional cohorts within CanPath means more statistical power and more representative results. Analyses that were previously underpowered within a single cohort become feasible.

Faster study execution. A live synthetic environment means researchers can test workflows and queries before the full data access process is complete. Preparatory work that previously added months to a study can now be done in parallel with the access process, so analysis can start as soon as real data is available.

International collaboration without data leaving Canada. The federated model is designed to extend beyond Canadian borders. Approved international studies can connect to Canadian cohorts and run analyses without any data crossing national boundaries. This makes Canada’s national research infrastructure compatible with global population health initiatives while maintaining Canadian data sovereignty.

Reduced administrative friction. Shared governance frameworks and standardized data dictionaries reduce the institutional back-and-forth that currently slows cross-site research. When the governance framework is shared, the time spent negotiating individual data sharing agreements shrinks.

A complete research analytics environment. The infrastructure is designed as a full environment for research—tools, workflows, governance, compute—not just a data storage solution. Researchers work within a purpose-built environment rather than assembling their own toolchain from separate components.

What This Means for Other Organizations

The CanPath–Lifebit partnership is the first implementation of this framework, but it is explicitly designed to grow. The infrastructure supports a national network that can expand to include other population cohorts, biobanks, sequencing programs, hospital research networks, and other data custodians. Any organization that adopts compatible governance and technical standards can connect without replacing what it already has.

This matters because Canada’s health research ecosystem, like those of most countries, is fragmented. Every institution building its own research infrastructure separately creates incompatible systems, duplicated investment, and data environments that can’t easily connect. A shared national foundation addresses this systematically rather than institution by institution.

“Canada does not need another pilot or prototype—we need infrastructure that works at scale, today,” said Maria Dunford, CEO at Lifebit.

The framework is also designed with long-term sustainability in mind. By establishing common approaches—shared data dictionaries, harmonized consent language, compatible technical standards—it creates conditions where new partners can join without disrupting what is already in place. Each new participant strengthens the network rather than requiring it to be rebuilt around them.

“The goal is not to centralize data,” added Dunford. “It is to make collaboration easier and more sustainable across Canada’s research community.”

Building Infrastructure That Lasts

Population health research moves slowly by design. Recruitment takes years. Follow-up takes decades. The datasets that emerge from this work are irreplaceable—you cannot go back and recollect twenty years of longitudinal health data. The infrastructure that supports this research needs to reflect that permanence.

What CanPath and Lifebit have built is not a pilot project or a demonstration system. It is production infrastructure, designed with the governance, security, and interoperability standards that national research programs require to operate reliably over time. The synthetic environment that is live today is the foundation. The phased onboarding of real cohort data throughout 2026 is the next step. The expansion to new partners and new use cases follows from there.

“We’re creating and sharing the conditions for collaboration at a national scale,” said Dr. Awadalla. “As more partners join, the benefits grow for researchers, institutions, and ultimately for Canadians.”

For organizations looking to understand how federated infrastructure applies to their own research environments, or exploring how to contribute to Canada’s national research network, this framework offers a working blueprint for what secure, scalable, governance-first research infrastructure looks like in practice.

Lifebit powers federated research infrastructure across more than 30 countries, supporting programs that manage over 275 million patient records. Get in touch to learn how federated infrastructure can work for your organization, or read about our role in the CanPath federation in our press release.



© 2026 Lifebit Biotech Inc. DBA Lifebit. All rights reserved.
