Civic Trial

Striving for diversity and representation in clinical trials through reporting accountability.

Last updated:

2009 - 2026

Summary metrics: Total Studies, Report Race, Report Ethnicity, Report Both

Reporting Trends Over Time

Race Distribution (NIH/OMB Categories)

Race Over Time

Proportion among studies that reported race (excludes missing and unknown).

Total Participants with Reported Race Data

Shows the total number of participants with explicitly reported race data per year (excludes "Unknown" and studies without race data).

Full Distribution with Data Quality

Proportion of total enrollment by category, distinguishing between explicitly unknown data and implicit missing (not reported) data.

Ethnicity Distribution

Note: The "Unknown or Not Reported" category is large because many studies do not collect ethnicity data or participants decline to report. This reflects limitations in data collection practices across clinical trials.

Ethnicity Over Time

Proportion among studies that reported ethnicity (excludes missing and unknown).

Total Participants with Reported Ethnicity Data

Shows the total number of participants with explicitly reported ethnicity data per year (excludes "Unknown" and studies without ethnicity data).

Full Distribution with Data Quality

Proportion of total enrollment by category, distinguishing between explicitly unknown data and implicit missing (not reported) data.

Sex Distribution

Sex Ratio Over Time

Gender Reporting

Note: Gender identity is rarely reported separately from biological sex in clinical trials.

Study Details

Click on any NCT ID to view the full trial on ClinicalTrials.gov. Click column headers to sort. Click checkmarks (✓) to view detailed demographic breakdowns.

Columns: NCT ID, Details, Study Start Date, Study End Date, Time to Report, Study Status, Results Posted, Last Update, Race, Ethnicity, Sex, Title, Phase, Type, Design, Primary Endpoint, Lead Sponsor, Participants, Publications

Map summary metrics: Total Trials, Single-site, Multi-site, Location Not Reported

Trials by US State

Click a state to see city-level breakdown. Darker colors indicate higher values.

Regional Distribution

Trials by US Census Region

Site Distribution

Breakdown of trials by number of sites.

Geographic Reporting Over Time

Percentage of trials reporting location data by year.

Frequently Asked Questions

How are conditions categorized?

Conditions are categorized using a standardized medical hierarchy. We group specific conditions (e.g., "Congenital Heart Disease") into broader Primary Categories (e.g., "Cardiovascular"), with more granular Secondary Categories underneath. This reduces redundancy from synonyms (e.g., "congenital heart defect" and "congenital heart disease" map to the same secondary category) and allows for both broad and granular filtering.

The classification uses a two-step process:

  1. Exact/Substring Match: Each condition is checked against a curated list of keywords and synonyms, matched longest-first so specific terms (e.g., "heart failure") take priority over general ones (e.g., "heart").
  2. Fuzzy Match: If no exact match is found, lightweight fuzzy string matching (via rapidfuzz) catches typos and minor variations (e.g., "Type II Diabetes" vs "Type 2 Diabetes").

Primary categories and example secondary categories:

  • Cardiovascular: Heart Failure, Coronary Artery Disease, Arrhythmia, Hypertension, Congenital Heart Disease, Valvular Heart Disease, Cardiomyopathy, Peripheral Vascular Disease
  • Oncology: Breast Cancer, Lung Cancer, Colorectal Cancer, Prostate Cancer, Hematologic Malignancy, Brain and CNS Tumors, Skin Cancer, Sarcoma
  • Neurology: Alzheimer's Disease and Dementia, Parkinson's Disease, Epilepsy and Seizure Disorders, Multiple Sclerosis, Stroke and Cerebrovascular, Headache and Migraine
  • Respiratory: COPD, Asthma, Pulmonary Fibrosis, Pneumonia, Pulmonary Hypertension, Sleep Apnea
  • Mental Health: Depression, Anxiety Disorders, Bipolar Disorder, Schizophrenia and Psychotic Disorders, PTSD and Trauma, ADHD, Autism Spectrum, Eating Disorders
  • Endocrine and Metabolic: Type 1 Diabetes, Type 2 Diabetes, Obesity, Thyroid Disorders, Lipid Disorders
  • Infectious Disease: HIV/AIDS, Hepatitis, COVID-19, Tuberculosis, Influenza, Bacterial Infections, Parasitic Diseases
  • Autoimmune and Inflammatory: Rheumatoid Arthritis, Systemic Lupus Erythematosus, Inflammatory Bowel Disease, Psoriasis and Psoriatic Arthritis, Vasculitis
  • Gastrointestinal: GERD and Esophageal, Liver Disease, Irritable Bowel Syndrome, Pancreatic Disorders
  • Kidney and Urological: Chronic Kidney Disease, End-Stage Renal Disease, Glomerular Diseases, Kidney Transplant, Urological Disorders
  • Musculoskeletal: Osteoarthritis, Osteoporosis, Back and Spine, Fibromyalgia, Gout, Fractures and Trauma
  • Dermatology: Eczema and Dermatitis, Psoriasis, Acne and Rosacea, Wound and Ulcer, Hair and Nail Disorders
  • Substance Use Disorders: Alcohol Use Disorder, Opioid Use Disorder, Tobacco and Nicotine
  • Hematology: Anemia, Coagulation Disorders, Thrombosis
  • Ophthalmology: Macular Degeneration, Glaucoma, Diabetic Eye Disease
  • Reproductive and Sexual Health: Infertility, Pregnancy Complications, Menopause and Hormonal
  • Transplant and Immunology: Solid Organ Transplant, Bone Marrow Transplant, Allergy
  • Rare Diseases: Cystic Fibrosis, Amyloidosis, Lysosomal Storage Disorders
  • Pain: Chronic Pain, Acute Pain, Cancer Pain
  • Other: Any condition not matching the above categories

Note: A study may have conditions spanning multiple categories. The dashboard filters show studies that match ANY of the keywords for the selected primary and/or secondary category. You can filter by primary category alone for broad analysis, or drill down to a specific secondary category for more targeted results.
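
For readers who want to see the two-step classification in concrete terms, here is a minimal sketch under stated assumptions. The keyword map, the 0.85 threshold, and the function name classify are all hypothetical and are not taken from the dashboard's code. The dashboard uses rapidfuzz for the fuzzy step; this sketch swaps in difflib from the Python standard library so that it runs without extra dependencies.

```python
from difflib import SequenceMatcher

# Hypothetical keyword map: keyword -> (primary category, secondary category).
# The real curated list is much larger and is not reproduced here.
KEYWORDS = {
    "congenital heart disease": ("Cardiovascular", "Congenital Heart Disease"),
    "congenital heart defect": ("Cardiovascular", "Congenital Heart Disease"),
    "heart failure": ("Cardiovascular", "Heart Failure"),
    "kidney transplant": ("Kidney and Urological", "Kidney Transplant"),
    "transplant": ("Transplant and Immunology", "Solid Organ Transplant"),
    "type 2 diabetes": ("Endocrine and Metabolic", "Type 2 Diabetes"),
    "asthma": ("Respiratory", "Asthma"),
}

FUZZY_THRESHOLD = 0.85  # assumed cutoff on a 0..1 similarity scale


def classify(condition):
    """Return (primary, secondary) for a free-text condition name."""
    text = condition.lower().strip()

    # Step 1: exact/substring match, longest keyword first, so that
    # "kidney transplant" wins over the more general "transplant".
    for keyword in sorted(KEYWORDS, key=len, reverse=True):
        if keyword in text:
            return KEYWORDS[keyword]

    # Step 2: fuzzy match against every keyword; keep the best score.
    best_keyword, best_score = None, 0.0
    for keyword in KEYWORDS:
        score = SequenceMatcher(None, text, keyword).ratio()
        if score > best_score:
            best_keyword, best_score = keyword, score
    if best_score >= FUZZY_THRESHOLD:
        return KEYWORDS[best_keyword]

    # Nothing matched well enough.
    return ("Other", "Other")


print(classify("Kidney Transplant Rejection"))
print(classify("Type II Diabetes"))
print(classify("Broken Toe"))
```

In this sketch the fuzzy step compares the whole condition string to each keyword, so it only tolerates small variations; a long condition name that merely contains a misspelled keyword would fall through to "Other".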

How is the "Unknown/Not Reported" category calculated?

In the "Distribution Including Unknowns" charts, the Unknown category is calculated as:

Unknown = Total Enrollment - Sum(All Known Categories)

This ensures the chart always sums to exactly 100% of total enrollment, providing a complete picture of data completeness.
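
As a worked illustration of this calculation, the sketch below derives the Unknown count by subtraction and converts every category to a share of total enrollment. The function name, the category labels, and the enrollment numbers are hypothetical. Raising an error when the known categories exceed total enrollment is an assumption made here; the source does not say how that case is handled.

```python
def distribution_with_unknown(total_enrollment, known_counts):
    """Return each category as a percentage of total enrollment.

    Unknown = total enrollment - sum of all known categories.
    """
    if total_enrollment <= 0:
        raise ValueError("total_enrollment must be positive")

    unknown = total_enrollment - sum(known_counts.values())
    if unknown < 0:
        # Assumption: treat over-counted categories as a data error.
        raise ValueError("known categories exceed total enrollment")

    counts = dict(known_counts)
    counts["Unknown/Not Reported"] = unknown
    return {name: 100 * n / total_enrollment for name, n in counts.items()}


# Hypothetical study: 200 enrolled, 180 with a reported race category.
shares = distribution_with_unknown(
    200, {"White": 120, "Black or African American": 40, "Asian": 20}
)
print(shares)
```

Because Unknown is defined as the remainder, the shares add up to 100% of enrollment by construction.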

How is Funding Source derived?

Funding source is categorized based on sponsor information:

  • Industry: Lead Sponsor is Industry
  • NIH: Lead Sponsor is NIH, OR (Lead Sponsor is Other/Network AND any Collaborator is NIH)
  • Other U.S. Federal: Lead Sponsor is Federal, OR (Lead Sponsor is Other/Network AND any Collaborator is Federal)
  • Other: All other cases
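
The rules above can be read as a short decision function. This is only a sketch: the function name and the plain-string sponsor classes ("Industry", "NIH", "Federal", "Other", "Network") are hypothetical labels chosen to mirror the wording of the list, and checking the rules in the order listed is an assumption.

```python
def funding_source(lead_class, collaborator_classes=()):
    """Derive a funding source label from sponsor classes."""
    collaborators = set(collaborator_classes)
    lead_is_other_or_network = lead_class in {"Other", "Network"}

    # Rules are checked in the order the list gives them (assumed precedence).
    if lead_class == "Industry":
        return "Industry"
    if lead_class == "NIH" or (lead_is_other_or_network and "NIH" in collaborators):
        return "NIH"
    if lead_class == "Federal" or (
        lead_is_other_or_network and "Federal" in collaborators
    ):
        return "Other U.S. Federal"
    return "Other"


print(funding_source("Other", ["NIH"]))        # lead is Other, NIH collaborates
print(funding_source("Industry", ["NIH"]))     # industry lead takes priority
print(funding_source("Network", ["Federal"]))  # network lead, federal collaborator
```

Under this reading, a collaborator only changes the label when the lead sponsor is Other or Network; an Industry lead is always labeled Industry.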

Is this work currently funded?

No, but we are open to conversations about supporting this work. Email us at info@civictrial.com.

What led you to do this?

Many people claim that trials are not diverse, and there are many ongoing initiatives to increase diversity in clinical trials. However, there are few publicly available tools to assess the progress of those initiatives or to get a holistic view of trial diversity. We thought this would be a great start.

Where do you get this data from?

We used the ClinicalTrials.gov API. It is a great resource and should be more widely used, as should programs like the Aggregate Analysis of ClinicalTrials.gov (AACT) database from the Clinical Trials Transformation Initiative. The demographic variables we display here are more difficult to parse than some of the more standardized variables (e.g., trial phase, total participants). We designed this dashboard based on our experience parsing some of these sociodemographic characteristics, namely race and ethnicity (pre-print here), and on ongoing projects examining sex, gender, and geography.

What about searching for specific trials and summarizing the information in other ways?

This project is currently focused on the demographics of clinical trials. Other tools do a great job of searching unstructured data from ClinicalTrials.gov. For example, there is a great connector for Claude Code built by the company deepsense.ai. More information on that connector is here.

How do you count trial sponsors?

We take a broad approach to capturing trial involvement. In our "Sponsor" filter, we count an organization if it is listed as either the Lead Sponsor or a Collaborator in the trial record. This allows us to capture the full ecosystem of organizations supporting a trial, rather than just the primary administrative entity.
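
A minimal sketch of this counting rule follows. The record layout (the lead_sponsor and collaborators keys), the organization names, the NCT IDs, and the case-insensitive exact name match are all hypothetical choices for illustration, not the dashboard's actual implementation.

```python
def matches_sponsor(trial, organization):
    """True if the organization is the lead sponsor or any collaborator."""
    target = organization.casefold()
    names = [trial.get("lead_sponsor", "")] + list(trial.get("collaborators", []))
    return any(name.casefold() == target for name in names)


# Hypothetical trial records with placeholder IDs and names.
trials = [
    {"nct_id": "NCT00000001", "lead_sponsor": "Example Pharma",
     "collaborators": ["Example University"]},
    {"nct_id": "NCT00000002", "lead_sponsor": "Example University",
     "collaborators": []},
    {"nct_id": "NCT00000003", "lead_sponsor": "Another Institute",
     "collaborators": []},
]

# Both the collaborator-only trial and the lead-sponsor trial are counted.
selected = [t["nct_id"] for t in trials if matches_sponsor(t, "Example University")]
print(selected)
```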

Civic Trial is a tool created to investigate the demographic characteristics of participants in clinical trials. It is centered on the ideal that diversity and representation in trials are important for preventing sample bias and improving study generalizability. This mission starts with understanding who is involved in studies, so that we can move toward developing strategies to improve representation.

Who built this

Maryam Aziz

Maryam Aziz is a Ph.D. candidate in Population Health Sciences at the Duke University School of Medicine and holds an M.S. in Computer Science from Columbia. Maryam researches the ethical application of AI in healthcare, with a focus on women's health.

Michael D. Green, Ph.D.

Michael is a Postdoctoral Researcher in the Department of Health, Behavior, and Society at the Johns Hopkins School of Public Health. Michael earned his Ph.D. in Population Health Sciences at the Duke University School of Medicine and a BA in Anthropology with honors from Dartmouth College. Michael's research focuses on unequal treatment in healthcare, specifically discrimination faced in healthcare settings.

Both hope to advance this work in two ways: first, by establishing a clear platform for accountability and transparency around the state of diversity in clinical trials; and second, by assisting trial sponsors, investigators, and companies with approaches to diversifying their trial populations so that trials are representative.