Research Roundup: Data Expansion Bolsters Trove of COVID-19 Data

June 23, 2022
[Infographic: “The Researcher Workbench.” COVID-19 data available: about 20,000 participants who have had SARS-CoV-2; more than 132,600 vaccine survey responses.]

All of Us Research Program expands COVID-19 related data available to researchers.

All of Us Research Program’s dataset is poised to help researchers unlock answers related to long COVID, social determinants of health, health disparities, and more

Health data from nearly 20,000 people who have had SARS-CoV-2 is now available to researchers across the U.S., opening new opportunities to study COVID-19 disease prevention, progression, and recovery. These records are part of an expanded dataset from the National Institutes of Health’s All of Us Research Program that encompasses clinical, genomic, and participant-reported information.

In addition to the COVID-19 data, the program has added more than 57,600 initial responses from its new social determinants of health (SDOH) survey to drive novel insights into how lived experiences affect health.

“The combination of data in the All of Us dataset—provided by participants from a wide range of communities and backgrounds—offers researchers an unprecedented resource to study how different aspects of our lives influence health outcomes,” said Josh Denny, M.D., M.S., chief executive officer of the All of Us Research Program.

COVID-19 Data

Among participants with a past SARS-CoV-2 infection, nearly 6,000 have whole genome sequences available, and 600 have shared Fitbit records. The dataset also includes custom survey information that lets researchers study participants’ experiences throughout the pandemic as well as health outcomes from COVID-19. The COVID-19 Participant Experience (COPE) Survey provides insights into the subtle and significant changes participants experienced in their daily lives during the tumultuous first year of the pandemic, as well as information about their mental and physical health. The updated All of Us dataset also includes more than 132,600 responses to the Minute Survey on COVID-19 Vaccines, which capture participants’ perspectives on vaccines along with their vaccination status and plans.

Social Determinants of Health Data

The program has also bolstered its SDOH data, adding participant-reported information about neighborhood safety, access to food, experiences with health care discrimination, and daily work and living environments. These SDOH survey responses can be paired with select data from the Census Bureau’s American Community Survey, linked by the first three digits of each participant’s ZIP code, and with the program’s other surveys on health care access, lifestyle, social support, and discrimination.
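For researchers wondering what pairing survey responses with area-level data might look like in practice, the following is a minimal sketch of a join on a three-digit ZIP prefix. It is not the Researcher Workbench schema or API: the field names (participant_id, zip3, food_insecure, median_income), the function link_by_zip3, and all values are hypothetical and made up for illustration.

```python
# Hypothetical illustration only: field names and values are invented,
# not the actual Researcher Workbench schema or real data.

def link_by_zip3(survey_rows, acs_by_zip3):
    """Attach area-level fields to each survey row via its three-digit ZIP prefix.

    Rows whose prefix has no area-level entry are kept, with those fields
    set to None, so that no participant is silently dropped.
    """
    # Collect every area-level field name so all output rows share one shape.
    acs_fields = sorted({key for rec in acs_by_zip3.values() for key in rec})
    linked = []
    for row in survey_rows:
        area = acs_by_zip3.get(row["zip3"], {})
        merged = dict(row)  # copy so the input rows are not modified
        for field in acs_fields:
            merged[field] = area.get(field)
        linked.append(merged)
    return linked


# Made-up survey responses, each carrying only a three-digit ZIP prefix.
survey_rows = [
    {"participant_id": "P1", "zip3": "372", "food_insecure": True},
    {"participant_id": "P2", "zip3": "021", "food_insecure": False},
    {"participant_id": "P3", "zip3": "999", "food_insecure": False},
]

# Made-up area-level values keyed by the same three-digit prefix.
acs_by_zip3 = {
    "372": {"median_income": 60000},
    "021": {"median_income": 80000},
}

linked = link_by_zip3(survey_rows, acs_by_zip3)
for row in linked:
    print(row)
```

Keeping unmatched rows with empty area-level fields, rather than discarding them, is a design choice in this sketch: it makes gaps in the linkage visible instead of shrinking the cohort without notice.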

“Without a comprehensive view of health, researchers cannot fully understand the underpinnings of health and health equity,” said Cheryl Clark, M.D., Sc.D., assistant professor of medicine at Harvard Medical School and co-chair of the All of Us Research Program Social Determinants of Health Task Force. “All of Us can change the paradigm of health research by bringing together social, biological, and clinical data to create a more complete picture of the factors influencing our health so that we can begin to imagine a more equitable future.”

New File Types, Tools and Resources

In this release, the All of Us Data and Research Center has added sample-level aligned sequence data in the form of CRAM files, complementing the variant call data released in March. These files expand genomic research capabilities, enabling custom variant evaluation and visualization with the newly added Integrative Genomics Viewer (IGV) tool. For researchers using genotyping array data, the Researcher Workbench now includes intensity data (IDAT) files for custom array analyses beyond variants.

“Our featured notebooks and workspaces create streamlined entry points to the dataset that researchers can use to test new hypotheses and accelerate their work,” said Paul Harris, PhD, principal investigator of the All of Us Data and Research Center, led by Vanderbilt University Medical Center. “These tools enable a cycle of replication that goes beyond any single institution and create a network of collaborators across the U.S.”

In total, the Researcher Workbench now includes data from more than 372,000 participants, nearly 80% of whom identify with groups historically underrepresented in medical research. All of Us also enlists diverse researchers and empowers them to study a range of scientific topics on the Researcher Workbench. The platform currently supports more than 2,300 registered researchers with more than 1,700 active projects. Nearly 85% of registered researchers consider themselves early-career researchers or are in similar roles, and nearly 55% identify with a group that is underrepresented in the medical research workforce.

This latest data refresh represents the second update to the Researcher Workbench this year, following the release of nearly 100,000 whole genome sequences and more than 165,000 genotyping arrays in March 2022. The next data release is planned for this winter, to include another large infusion of genomic data and more.

This article appears in the June 2022 issue of All of Us Research Roundup. Subscribe to receive future issues of the bimonthly researcher newsletter.

View the full June edition of the All of Us Research Roundup.

All of Us is a registered service mark of the U.S. Department of Health & Human Services (HHS).