phenotype2phenopacket
Details
GitHub | monarch-initiative/phenotype2phenopacket |
Language | Python |
Description | Phenotype2Phenopacket is a command-line tool that converts a phenotype annotation file into GA4GH Phenopackets, facilitating standardised phenotypic data representation. |
Dependencies
External Dependencies
Package | Version |
---|---|
python | >=3.10,<4.0 |
pandas | ^2.0.0 |
phenopackets | ^2.0.2 |
polars | ^0.19 |
oaklib | ^0.5.1 |
click | ^8.1.3 |
pheval | ^0.4.0 |
Documentation
Phenotype2Phenopacket
Phenotype2Phenopacket is a command-line interface (CLI) application for the construction of phenopackets from a phenotype annotation file.
Installation
Phenotype2Phenopacket can be installed from PyPI:
pip install phenotype2phenopacket
Usage
To convert all OMIM diseases in a phenotype annotation file to disease phenopackets, where all phenotypes are retained:
p2p convert --phenotype-annotation /path/to/phenotype.hpoa --output-dir /path/to/output-dir
To create synthetic patient disease phenopackets, where the dataset is more variable, frequencies are taken into account, and constrained noise is applied:
p2p create --phenotype-annotation /path/to/phenotype.hpoa --output-dir /path/to/output-dir
You can also limit the number of disease phenopackets converted/created:
p2p convert --phenotype-annotation /path/to/phenotype.hpoa --output-dir /path/to/output-dir --num-diseases 100
To restrict the output to a specific OMIM disease:
p2p create --phenotype-annotation /path/to/phenotype.hpoa --output-dir /path/to/output-dir --omim-id OMIM:619340
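When the same command has to be run many times with different filters, it can help to assemble it programmatically. The following is a minimal sketch, not part of the tool itself: `build_p2p_command` and `run_p2p` are hypothetical helper names, and only the flags shown in the commands above are used. The examples above show `--omim-id` with `create` only, so whether every flag combination is accepted by both subcommands is an assumption to verify against the CLI help.

```python
import shlex
import subprocess


def build_p2p_command(subcommand, phenotype_annotation, output_dir,
                      num_diseases=None, omim_id=None):
    """Assemble an argument list for `p2p convert` or `p2p create`.

    Hypothetical helper: it only strings together the flags shown in the
    usage examples and does not validate them against the real CLI.
    """
    if subcommand not in ("convert", "create"):
        raise ValueError(f"Unsupported subcommand: {subcommand!r}")
    command = [
        "p2p", subcommand,
        "--phenotype-annotation", str(phenotype_annotation),
        "--output-dir", str(output_dir),
    ]
    if num_diseases is not None:
        # Limit the number of disease phenopackets converted/created.
        command += ["--num-diseases", str(num_diseases)]
    if omim_id is not None:
        # Restrict the output to a single OMIM disease.
        command += ["--omim-id", omim_id]
    return command


def run_p2p(command):
    """Run the assembled command; requires phenotype2phenopacket to be installed."""
    subprocess.run(command, check=True)


command = build_p2p_command(
    "create",
    "/path/to/phenotype.hpoa",
    "/path/to/output-dir",
    omim_id="OMIM:619340",
)
# Print the shell-quoted form for inspection before running anything.
print(shlex.join(command))
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when paths contain spaces. Call `run_p2p(command)` once the printed command looks right.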
To restrict the output to a list of OMIM IDs specified in a text file, with one ID per line:
p2p create --phenotype-annotation /path/to/phenotype.hpoa --output-dir /path/to/output-dir --omim-id-list /path/to/list.txt
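The list file described above is plain text with one OMIM ID per line. As a rough sketch of how such a file might be produced, the snippet below writes one from a Python list. `write_omim_id_list` is a hypothetical helper, and the `OMIM:<digits>` pattern is an assumption inferred from the example ID `OMIM:619340`, so the tool itself may accept other forms.

```python
import re
from pathlib import Path

# Assumed ID shape, inferred from the example "OMIM:619340".
OMIM_ID_PATTERN = re.compile(r"OMIM:\d+")


def write_omim_id_list(omim_ids, path):
    """Write OMIM IDs to a text file, one per line, and return the cleaned IDs.

    Hypothetical helper: strips surrounding whitespace, rejects entries that
    do not match the assumed pattern, and drops duplicates while keeping
    first-seen order.
    """
    cleaned = []
    for raw in omim_ids:
        omim_id = raw.strip()
        if not OMIM_ID_PATTERN.fullmatch(omim_id):
            raise ValueError(f"Not a recognised OMIM ID: {raw!r}")
        if omim_id not in cleaned:
            cleaned.append(omim_id)
    # One ID per line, with a trailing newline at the end of the file.
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned
```

For example, `write_omim_id_list(["OMIM:619340", "OMIM:154700"], "/path/to/list.txt")` would produce a two-line file that can then be passed to `--omim-id-list`. The second ID is included purely for illustration.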
To add known gene-to-phenotype relationships to phenopackets:
p2p add-genes --phenopacket-dir /path/to/synthetic-phenopackets --genes-to-disease /path/to/genes_to_disease.txt --hgnc-data /path/to/hgnc_complete_set.txt --output-dir /path/to/output-dir
NOTE: Adding known gene-to-phenotype relationships requires the genes_to_disease.txt file. It can be downloaded here.