GPSEA
The concept of phenotype denotes the observable attributes of an individual, but in medical contexts, the word “phenotype” is used to refer to some deviation from normal morphology, physiology, or behavior (cf. Deep phenotyping for precision medicine). A key question in biology and human genetics concerns the relationships between phenotypic abnormalities and genotype. In Mendelian genetics, the focus is generally on whether specific disease-causing alleles are associated with specific phenotypic manifestations of the disease.
GPSEA (Genotypes and Phenotypes - Statistical Evaluation of Associations, pronounced “G”-“P”-“C”) is a Python package designed to support genotype-phenotype correlation analysis. The input to GPSEA is a collection of Global Alliance for Genomics and Health (GA4GH) Phenopackets. GPSEA ingests data from these phenopackets and analyzes how specific variants, variant types (e.g., missense vs. premature termination codon), or variant locations in protein motifs or other features correlate with phenotypic abnormalities. The phenotypic abnormalities are represented by Human Phenotype Ontology (HPO) terms. Statistical analysis is performed using Fisher's exact test, and results are reported for each tested HPO term.
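To make the statistical step concrete, the following is a minimal sketch of a two-sided Fisher's exact test on a single 2x2 table, written with the Python standard library only. It is not GPSEA's API: the function name `fisher_exact_two_sided` and the counts are hypothetical and chosen purely for illustration. The table compares two genotype groups (rows) against the presence or absence of one HPO term (columns).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table that is no more likely
    # than the observed one (small tolerance for float comparison).
    p_value = sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts for one HPO term (illustration only, not real data):
#                   term present   term absent
# missense               8              2
# other variants         1              5
p = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p-value: {p:.4f}")
```

In an analysis of this kind, one such table would be built and tested for each HPO term under consideration.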
We recommend that users create a Jupyter notebook for each cohort of patients they would like to test.
This documentation includes installation instructions, a brief tutorial, and a comprehensive user guide. Technical details are available in the API reference.
Literature
We provide recommended reading for background on the study of genotype-phenotype correlations.
Orgogozo V, et al. (2015) The differential view of genotype-phenotype relationships.
Feedback
The best place to leave feedback, ask questions, and report bugs is the GPSEA Issue Tracker.