Update phenopacket

create_updated_phenopacket(gene_identifier, phenopacket_path, output_dir)

Update the gene context within the interpretations of a Phenopacket and write the updated Phenopacket to the output directory.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gene_identifier | str | Identifier used to update the gene context. | required |
| phenopacket_path | Path | The path to the input Phenopacket file. | required |
| output_dir | Path | The directory where the updated Phenopacket will be written. | required |

Notes

The gene_identifier parameter should be one of ensembl_id, hgnc_id, or entrez_id. It selects the identifier type to which the gene identifiers in the Phenopacket are updated. We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.

Source code in src/pheval/prepare/update_phenopacket.py
def create_updated_phenopacket(
    gene_identifier: str, phenopacket_path: Path, output_dir: Path
) -> None:
    """
    Update the gene context within the interpretations of a Phenopacket and write the updated Phenopacket.

    Args:
        gene_identifier (str): Identifier used to update the gene context.
        phenopacket_path (Path): The path to the input Phenopacket file.
        output_dir (Path): The directory where the updated Phenopacket will be written.

    Notes:
        The gene_identifier parameter should be one of ensembl_id, hgnc_id, or entrez_id. It selects
        the identifier type to which the gene identifiers in the Phenopacket are updated.
        We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.
    """
    hgnc_data = create_hgnc_dict()
    updated_phenopacket = update_outdated_gene_context(phenopacket_path, gene_identifier, hgnc_data)
    write_phenopacket(updated_phenopacket, output_dir.joinpath(phenopacket_path.name))
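The output location is built with output_dir.joinpath(phenopacket_path.name), so the updated file keeps the input file's name and any parent directories of the input are dropped. A minimal sketch of that path handling, using only pathlib; the helper name updated_phenopacket_path and the example paths are hypothetical and are not part of pheval:

```python
from pathlib import Path


def updated_phenopacket_path(phenopacket_path: Path, output_dir: Path) -> Path:
    """Return the location an updated copy of phenopacket_path would be written to."""
    # Only the final path component (the file name) is kept;
    # the parent directories of the input are discarded.
    return output_dir.joinpath(phenopacket_path.name)


source = Path("phenopackets") / "patient_1.json"
target = updated_phenopacket_path(source, Path("updated_phenopackets"))
print(target.as_posix())  # updated_phenopackets/patient_1.json
```

One consequence of this design is that writing to the same directory as the input would overwrite the original file, so a separate output directory is the safer choice.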

create_updated_phenopackets(gene_identifier, phenopacket_dir, output_dir)

Update the gene context within the interpretations of every Phenopacket in a directory and write the updated Phenopackets to the output directory.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gene_identifier | str | Identifier used to update the gene context. | required |
| phenopacket_dir | Path | The path to the input Phenopacket directory. | required |
| output_dir | Path | The directory where the updated Phenopackets will be written. | required |

Notes

The gene_identifier parameter should be one of ensembl_id, hgnc_id, or entrez_id. It selects the identifier type to which the gene identifiers in each Phenopacket are updated. We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.

Source code in src/pheval/prepare/update_phenopacket.py
def create_updated_phenopackets(
    gene_identifier: str, phenopacket_dir: Path, output_dir: Path
) -> None:
    """
    Update the gene context within the interpretations of every Phenopacket in a directory
    and write the updated Phenopackets.

    Args:
        gene_identifier (str): Identifier used to update the gene context.
        phenopacket_dir (Path): The path to the input Phenopacket directory.
        output_dir (Path): The directory where the updated Phenopackets will be written.

    Notes:
        The gene_identifier parameter should be one of ensembl_id, hgnc_id, or entrez_id. It selects
        the identifier type to which the gene identifiers in each Phenopacket are updated.
        We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.
    """
    hgnc_data = create_hgnc_dict()
    for phenopacket_path in all_files(phenopacket_dir):
        updated_phenopacket = update_outdated_gene_context(
            phenopacket_path, gene_identifier, hgnc_data
        )
        write_phenopacket(updated_phenopacket, output_dir.joinpath(phenopacket_path.name))
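The directory variant follows a common batch pattern: prepare shared data once (here, hgnc_data), then loop over the input files and write each result under the same file name. The sketch below illustrates only that pattern with the standard library. The function process_directory and its transform argument are hypothetical, and the plain iterdir() listing is a stand-in for all_files, whose exact behaviour (for example, whether it recurses) is not shown in this section:

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def process_directory(
    input_dir: Path, output_dir: Path, transform: Callable[[str], str]
) -> List[Path]:
    """Apply transform to every file in input_dir and write each result under output_dir."""
    written = []
    # sorted() gives a deterministic order; iterdir() alone guarantees none.
    for input_path in sorted(p for p in input_dir.iterdir() if p.is_file()):
        output_path = output_dir.joinpath(input_path.name)  # same file name, new directory
        output_path.write_text(transform(input_path.read_text()))
        written.append(output_path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    input_dir = Path(tmp, "in")
    output_dir = Path(tmp, "out")
    input_dir.mkdir()
    output_dir.mkdir()
    for name in ("a.json", "b.json"):
        input_dir.joinpath(name).write_text("{}")
    # str.upper is a placeholder transform standing in for the real update step.
    written_names = [p.name for p in process_directory(input_dir, output_dir, str.upper)]

print(written_names)  # ['a.json', 'b.json']
```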

update_outdated_gene_context(phenopacket_path, gene_identifier, hgnc_data)

Update the gene context of the Phenopacket.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| phenopacket_path | Path | The path to the Phenopacket file. | required |
| gene_identifier | str | Identifier to update the gene context. | required |
| hgnc_data | defaultdict | The HGNC data used for updating. | required |

Returns:

| Type | Description |
| --- | --- |
| Union[Phenopacket, Family] | The updated Phenopacket or Family. |

Notes

This function reads the Phenopacket or Family from phenopacket_path and updates the gene context within its interpretations. The gene_identifier parameter should be one of ensembl_id, hgnc_id, or entrez_id. It selects the identifier type to which the gene identifiers are updated. We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.

Source code in src/pheval/prepare/update_phenopacket.py
def update_outdated_gene_context(
    phenopacket_path: Path, gene_identifier: str, hgnc_data: defaultdict
) -> Union[Phenopacket, Family]:
    """
    Update the gene context of the Phenopacket.

    Args:
        phenopacket_path (Path): The path to the Phenopacket file.
        gene_identifier (str): Identifier to update the gene context.
        hgnc_data (defaultdict): The HGNC data used for updating.

    Returns:
        Union[Phenopacket, Family]: The updated Phenopacket or Family.

    Notes:
        This function reads the Phenopacket or Family from phenopacket_path and updates the gene context
        within its interpretations. The gene_identifier parameter should be one of ensembl_id, hgnc_id,
        or entrez_id. It selects the identifier type to which the gene identifiers are updated.
        We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.
    """
    phenopacket = phenopacket_reader(phenopacket_path)
    interpretations = PhenopacketUtil(phenopacket).interpretations()
    updated_interpretations = GeneIdentifierUpdater(
        hgnc_data=hgnc_data, gene_identifier=gene_identifier
    ).update_genomic_interpretations_gene_identifier(interpretations, phenopacket_path)
    return PhenopacketRebuilder(phenopacket).update_interpretations(updated_interpretations)
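To make the idea of "updating the gene context" concrete, the toy example below rewrites a gene's identifier to the requested identifier type using a lookup table. It is only an illustration under stated assumptions: the dictionary layout of GENE_LOOKUP, the function update_gene_context, and the plain-dict interpretations are all hypothetical simplifications, and they do not reproduce the internals of GeneIdentifierUpdater or the real structure of hgnc_data:

```python
from copy import deepcopy

# Hypothetical lookup keyed by gene symbol; the real hgnc_data structure may differ.
GENE_LOOKUP = {
    "BRCA1": {
        "ensembl_id": "ENSG00000012048",
        "hgnc_id": "HGNC:1100",
        "entrez_id": "672",
    },
}


def update_gene_context(interpretations: list, gene_identifier: str, lookup: dict) -> list:
    """Return a copy of interpretations with each gene's value_id set to the chosen identifier type."""
    if gene_identifier not in ("ensembl_id", "hgnc_id", "entrez_id"):
        raise ValueError(f"Unsupported gene identifier: {gene_identifier}")
    updated = deepcopy(interpretations)  # leave the caller's data untouched
    for interpretation in updated:
        gene = interpretation["gene_context"]
        identifiers = lookup.get(gene["symbol"])
        if identifiers is not None:  # unknown symbols are left as they are
            gene["value_id"] = identifiers[gene_identifier]
    return updated


original = [{"gene_context": {"symbol": "BRCA1", "value_id": "672"}}]
updated = update_gene_context(original, "ensembl_id", GENE_LOOKUP)
print(updated[0]["gene_context"]["value_id"])  # ENSG00000012048
```

Returning a modified copy rather than editing in place loosely mirrors the source, which builds a new object through PhenopacketRebuilder instead of returning the one it read.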

update_phenopackets(gene_identifier, phenopacket_path, phenopacket_dir, output_dir)

Update the gene identifiers in either a single phenopacket or a directory of phenopackets.

Parameters:

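| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gene_identifier | str | Identifier used to update the gene context. | required |
| phenopacket_path | Path | The path to a single Phenopacket file, or None to process phenopacket_dir instead. | required |
| phenopacket_dir | Path | The directory containing multiple Phenopacket files; used only when phenopacket_path is None. | required |
| output_dir | Path | The output directory to save the updated Phenopacket files. | required |

Notes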

The gene_identifier parameter should be one of ensembl_id, hgnc_id, or entrez_id. It selects the identifier type to which the gene identifiers in each Phenopacket are updated. We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.
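The function's source selects its mode with an if/elif on phenopacket_path and phenopacket_dir, which means a single file takes precedence when both are supplied, and nothing is processed when both are None. The sketch below isolates that decision; choose_mode and its return strings are hypothetical names used only for illustration:

```python
from pathlib import Path
from typing import Optional


def choose_mode(phenopacket_path: Optional[Path], phenopacket_dir: Optional[Path]) -> str:
    """Mirror the if/elif order: a single file wins over a directory."""
    if phenopacket_path is not None:
        return "single"
    elif phenopacket_dir is not None:
        return "directory"
    return "nothing"  # neither input given: no Phenopacket is processed


print(choose_mode(Path("patient_1.json"), None))                  # single
print(choose_mode(None, Path("phenopackets")))                    # directory
print(choose_mode(Path("patient_1.json"), Path("phenopackets")))  # single
print(choose_mode(None, None))                                    # nothing
```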

Source code in src/pheval/prepare/update_phenopacket.py
def update_phenopackets(
    gene_identifier: str, phenopacket_path: Path, phenopacket_dir: Path, output_dir: Path
) -> None:
    """
    Update the gene identifiers in either a single phenopacket or a directory of phenopackets.

    Args:
        gene_identifier (str): Identifier used to update the gene context.
        phenopacket_path (Path): The path to a single Phenopacket file, or None to process
            phenopacket_dir instead.
        phenopacket_dir (Path): The directory containing multiple Phenopacket files; used only when
            phenopacket_path is None.
        output_dir (Path): The output directory to save the updated Phenopacket files.

    Notes:
        The gene_identifier parameter should be one of ensembl_id, hgnc_id, or entrez_id. It selects
        the identifier type to which the gene identifiers in each Phenopacket are updated.
        We recommend using the ENSEMBL namespace (ensembl_id) to describe gene identifiers.
    """
    output_dir.mkdir(exist_ok=True)
    if phenopacket_path is not None:
        create_updated_phenopacket(gene_identifier, phenopacket_path, output_dir)
    elif phenopacket_dir is not None:
        create_updated_phenopackets(gene_identifier, phenopacket_dir, output_dir)
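The call output_dir.mkdir(exist_ok=True) tolerates an output directory that already exists, but, as standard pathlib behaviour, it does not create missing parent directories. The sketch below demonstrates this with temporary directories; the directory names are arbitrary, and passing parents=True is shown only as an option a caller could apply before invoking the function, not as something the source does:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    flat = Path(tmp, "updated")
    flat.mkdir(exist_ok=True)
    flat.mkdir(exist_ok=True)  # second call is a no-op rather than an error

    nested = Path(tmp, "results", "updated")
    try:
        nested.mkdir(exist_ok=True)  # parent "results" does not exist yet
        nested_error = None
    except FileNotFoundError as error:
        nested_error = type(error).__name__

    nested.mkdir(parents=True, exist_ok=True)  # creates "results" and "updated"
    nested_created = nested.is_dir()

print(nested_error, nested_created)  # FileNotFoundError True
```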