FAQ | PMBB Browser

Q: How should I cite the browser?

Penn Medicine BioBank Allele Frequency Browser,
https://pmbb.med.upenn.edu/biobank/allele-frequency/

Q: What are the details of the study population?

The Penn Medicine BioBank (PMBB) is a research program that recruits participants throughout the University of Pennsylvania Health System (5.7 million outpatient visits in the fiscal year 2018) by enrolling at the time of outpatient visits. Patients participate by completing a questionnaire, donating a blood sample, and allowing researchers access to their Electronic Health Record (EHR) information. The PMBB has recruited ~60,000 participants to date. Nearly 25% of PMBB participants (n ~15,000) are of African ancestry, consistent with the greater Philadelphia catchment area, making the PMBB one of the largest single institutional academic biobanks of African ancestry individuals in the country. Approximately 20,000 of these participants have already been genotyped (imputed to 1KG Phase3 dataset using Michigan Imputation Server) and 12,000 have whole exome sequence data. These are data from the PMBB-Release-2017.

Q: Who are the contributors?

Penn Medicine BioBank Team:

Regeneron Genetics Center

Q: What genome build was used for the whole exome data in the browser?

The genetic data included in the PMBB allele frequency browser are in genome build Genome Reference Consortium Human Build 37 (GRCh37), also known as hg19.

Q: What are the clinical characteristics?

We derive the clinical information about the PMBB participants from the Electronic Health Record (EHR). The EHR at Penn Medicine is based in Epic and is used by all Penn Medicine clinicians with consistent clinical definitions, formularies, clinical decision support, and other features. PennChart, the branded Epic product, now supports all 2,000 physicians practicing throughout the Penn health system. Penn Data Store (PDS), Penn Medicine’s clinical data warehouse, contains over 6 million patient records and other discrete clinical information amalgamated from 12 different source systems throughout the enterprise. The PDS uses standardized language from national coding systems including SNOMED, LOINC, and RxNORM; all data are being modeled in the OMOP data model.

Population Characteristics of Penn Medicine BioBank

		Genotype Chip	Exome Sequence
Total Patients		19,515	10,900
Female (%)		7856 (41%)	4432 (40.7%)
Median Age (at enrollment), yr		66	67
Body mass index		30.02 (13 - 83)
Race	European American	11580	8198
	African American	5988	2172
	Ad Mixed American	795	304
	East Asian	79	79
	South East Asian	114	114
	Other	575	33

Q: What is the population structure?

Here we show the first two principal components of ancestry, generated using common variants. Colors indicate ancestry groups. AFR = African; AMR = Native American, EAS = East Asian; EUR = European, SAS = South Asian

Q: Are individuals in the cohort related?

Relatedness between the individuals was estimated using identify-by-descent. We removed the related individuals with pi-hat threshold of 0.25 which account for relatives up-to first cousins.