gpsea.preprocessing package
- gpsea.preprocessing.configure_caching_patient_creator(hpo: MinimalOntology, genome_build: str = 'GRCh38.p13', validation_runner: ValidationRunner | None = None, cache_dir: str | None = None, variant_fallback: str = 'VEP', timeout: float = 30.0) PhenopacketPatientCreator [source]
A convenience function for configuring a caching PhenopacketPatientCreator.
To create the patient creator, we need hpo-toolkit’s representation of HPO. Other options are optional.
- Parameters:
hpo – an HPO instance.
genome_build – name of the genome build to use, choose from {‘GRCh37.p13’, ‘GRCh38.p13’}.
validation_runner – an instance of the validation runner.
cache_dir – path to the folder where we will cache the results fetched from the remote APIs or None if the cache location should be determined as described in get_cache_dir_path(). In any case, the directory will be created if it does not exist (including non-existing parents).
variant_fallback – the fallback variant annotator to use if we cannot find the annotation locally. Choose from {'VEP'} (just one fallback implementation is available at the moment).
timeout – timeout in seconds for the REST APIs.
- gpsea.preprocessing.configure_patient_creator(hpo: MinimalOntology, genome_build: str = 'GRCh38.p13', validation_runner: ValidationRunner | None = None, variant_fallback: str = 'VEP', validation: str = 'lenient', timeout: float = 30.0) PhenopacketPatientCreator [source]
A convenience function for configuring a non-caching PhenopacketPatientCreator.
To create the patient creator, we need hpo-toolkit’s representation of HPO. Other options are optional.
- Parameters:
hpo – an HPO instance.
genome_build – name of the genome build to use, choose from {‘GRCh37.p13’, ‘GRCh38.p13’}.
validation_runner – an instance of the validation runner.
variant_fallback – the fallback variant annotator to use if we cannot find the annotation locally. Choose from {'VEP'} (just one fallback implementation is available at the moment).
timeout – timeout in seconds for the REST APIs.
- gpsea.preprocessing.configure_caching_cohort_creator(hpo: MinimalOntology, genome_build: str = 'GRCh38.p13', validation_runner: ValidationRunner | None = None, cache_dir: str | None = None, variant_fallback: str = 'VEP', timeout: float = 30.0) CohortCreator[Phenopacket] [source]
A convenience function for configuring a CohortCreator that uses a caching PhenopacketPatientCreator.
To create the patient creator, we need hpo-toolkit’s representation of HPO. Other options are optional.
- Parameters:
hpo – an HPO instance.
genome_build – name of the genome build to use, choose from {‘GRCh37.p13’, ‘GRCh38.p13’}.
validation_runner – an instance of the validation runner.
cache_dir – path to the folder where we will cache the results fetched from the remote APIs or None if the cache location should be determined as described in get_cache_dir_path(). In any case, the directory will be created if it does not exist (including non-existing parents).
variant_fallback – the fallback variant annotator to use if we cannot find the annotation locally. Choose from {'VEP'} (just one fallback implementation is available at the moment).
timeout – timeout in seconds for the REST APIs.
- gpsea.preprocessing.configure_cohort_creator(hpo: MinimalOntology, genome_build: str = 'GRCh38.p13', validation_runner: ValidationRunner | None = None, variant_fallback: str = 'VEP', timeout: float = 30.0) CohortCreator[Phenopacket] [source]
A convenience function for configuring a CohortCreator that uses a non-caching PhenopacketPatientCreator.
To create the patient creator, we need hpo-toolkit’s representation of HPO. Other options are optional.
- Parameters:
hpo – an HPO instance.
genome_build – name of the genome build to use, choose from {‘GRCh37.p13’, ‘GRCh38.p13’}.
validation_runner – an instance of the validation runner.
variant_fallback – the fallback variant annotator to use if we cannot find the annotation locally. Choose from {'VEP'} (just one fallback implementation is available at the moment).
timeout – timeout in seconds for the VEP API.
- gpsea.preprocessing.configure_default_protein_metadata_service(protein_source: Literal['UNIPROT'] = 'UNIPROT', cache_dir: str | None = None, timeout: float = 30.0) ProteinMetadataService [source]
Create default protein metadata service that will cache the protein metadata in current working directory under .gpsea_cache/protein_cache and reach out to UNIPROT REST API if a cache entry is missing.
- Parameters:
protein_source – a str with the code of the protein data source (currently accepting just UNIPROT).
cache_dir – path to the folder where we will cache the results fetched from the remote APIs or None if the data should be cached as described by the get_cache_dir_path() function. In any case, the directory will be created if it does not exist (including any non-existing parents).
timeout – timeout in seconds for the REST APIs.
- gpsea.preprocessing.configure_protein_metadata_service(cache_dir: str | None = None, timeout: float = 30.0) ProteinMetadataService [source]
Configure default protein metadata service.
The service will cache the responses in cache_dir and reach out to UNIPROT API for cache misses.
- Parameters:
cache_dir – path to the folder where we will cache the results fetched from the remote APIs or None if the data should be cached in .gpsea_cache folder in the current working directory. In any case, the directory will be created if it does not exist (including any non-existing parents).
timeout – timeout in seconds for the REST APIs.
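The cache folder behaviour described above (fall back to a .gpsea_cache folder in the current working directory, create the folder and any missing parents) can be sketched with the standard library alone. The helper name resolve_cache_dir is hypothetical and is not part of gpsea; the sketch only illustrates the described behaviour, not how the library implements it.

```python
import os
import tempfile
from typing import Optional


def resolve_cache_dir(cache_dir: Optional[str] = None) -> str:
    """Hypothetical helper: pick the cache folder and make sure it exists."""
    if cache_dir is None:
        # Assumed default, taken from the parameter description above.
        cache_dir = os.path.join(os.getcwd(), ".gpsea_cache")
    # makedirs creates missing parent folders as well;
    # exist_ok=True turns the call into a no-op if the folder is already there.
    os.makedirs(cache_dir, exist_ok=True)
    return cache_dir


# Try it in a throwaway location with two missing parent folders.
with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "a", "b", "cache")
    created = resolve_cache_dir(nested)
    created_exists = os.path.isdir(created)
```

Calling the helper a second time with the same path is harmless, which matches the "in any case, the directory will be created if it does not exist" wording.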
- class gpsea.preprocessing.VariantCoordinateFinder[source]
Bases: Generic[T]
- abstract find_coordinates(item: T) VariantCoordinates | None [source]
Try to find VariantCoordinates from an item of some sort.
The variant coordinates may not be available all the time, and None may be returned.
- Raises:
ValueError – if there is an error of any kind.
- class gpsea.preprocessing.FunctionalAnnotator[source]
Bases: object
- abstract annotate(variant_coordinates: VariantCoordinates) Sequence[TranscriptAnnotation] [source]
Compute functional annotations for the variant coordinates. The annotations can be empty.
- Returns:
a sequence of transcript annotations
- Raises:
ValueError – if the annotation cannot proceed due to the remote resource being offline, etc.
- class gpsea.preprocessing.ImpreciseSvFunctionalAnnotator[source]
Bases: object
Annotator for large SVs that lack the exact breakpoint coordinates.
- abstract annotate(item: ImpreciseSvInfo) Sequence[TranscriptAnnotation] [source]
Compute functional annotations for a large SV.
- Returns:
a sequence of transcript annotations
- Raises:
ValueError – if the annotation cannot proceed due to the remote resource being offline, etc.
- class gpsea.preprocessing.ProteinMetadataService[source]
Bases: object
A service for obtaining annotations for a given protein accession ID.
The annotations include elements of the ProteinMetadata class.
- abstract annotate(protein_id: str) ProteinMetadata [source]
Prepare ProteinMetadata for a protein with given protein_id accession ID.
- Parameters:
protein_id (string) – an accession ID str (e.g. NP_001027558.1)
- Returns:
a ProteinMetadata container with the protein metadata
- Return type:
ProteinMetadata
- Raises:
- class gpsea.preprocessing.PatientCreator[source]
Bases: Generic[T], Auditor[T, Patient]
PatientCreator can create a Patient from some input T.
PatientCreator is an Auditor, hence the input is sanitized and any errors are reported to the caller.
- class gpsea.preprocessing.CohortCreator(patient_creator: PatientCreator[T])[source]
Bases: Generic[T], Auditor[Iterable[T], Cohort]
CohortCreator creates a cohort from an iterable of some T where T represents a cohort member.
- class gpsea.preprocessing.PhenopacketVariantCoordinateFinder(build: GenomeBuild, hgvs_coordinate_finder: VariantCoordinateFinder[str])[source]
Bases: VariantCoordinateFinder[GenomicInterpretation]
PhenopacketVariantCoordinateFinder figures out VariantCoordinates and Genotype from the GenomicInterpretation element of Phenopacket Schema.
- Parameters:
build – genome build to use in VariantCoordinates
hgvs_coordinate_finder – the coordinate finder to use for parsing HGVS expressions
- find_coordinates(item: GenomicInterpretation) VariantCoordinates | None [source]
Tries to extract the variant coordinates from the GenomicInterpretation.
- Parameters:
item (GenomicInterpretation) – a genomic interpretation element from Phenopacket Schema
- Returns:
variant coordinates
- Return type:
VariantCoordinates | None
- class gpsea.preprocessing.PhenopacketPatientCreator(build: GenomeBuild, phenotype_creator: PhenotypeCreator, functional_annotator: FunctionalAnnotator, imprecise_sv_functional_annotator: ImpreciseSvFunctionalAnnotator, hgvs_coordinate_finder: VariantCoordinateFinder[str], assume_karyotypic_sex: bool = True)[source]
Bases: PatientCreator[Phenopacket]
PhenopacketPatientCreator transforms Phenopacket into Patient.
- gpsea.preprocessing.load_phenopacket_folder(pp_directory: str, cohort_creator: CohortCreator[Phenopacket], validation_policy: Literal['none', 'lenient', 'strict'] = 'none') Tuple[Cohort, PreprocessingValidationResult] [source]
Load phenopacket JSON files from a directory, validate the patient data, and assemble the patients into a cohort.
A file with .json suffix is considered to be a JSON file and all JSON files are assumed to be phenopackets. Non-JSON files are ignored.
- Parameters:
pp_directory – path to a folder with phenopacket JSON files. An error is raised if the path does not point to a directory with at least one phenopacket.
cohort_creator – cohort creator for turning a sequence of phenopackets into a Cohort.
validation_policy – a str with the validation policy. The value must be one of {‘none’, ‘lenient’, ‘strict’}
- Returns:
a tuple with the cohort and the validation result.
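The file-selection rule described above (files with the .json suffix are treated as phenopackets, other files are ignored, and a folder without any phenopacket is an error) can be sketched as follows. The helper find_json_files is hypothetical and the choice of ValueError is an assumption; the source only says that an error is raised.

```python
import os
import tempfile
from typing import List


def find_json_files(pp_directory: str) -> List[str]:
    """Hypothetical helper: list the paths of the `.json` files in a folder."""
    if not os.path.isdir(pp_directory):
        # Assumed error type; the documentation does not name it.
        raise ValueError(f"Not a directory: {pp_directory}")
    paths = sorted(
        os.path.join(pp_directory, name)
        for name in os.listdir(pp_directory)
        if name.endswith(".json")  # files without the suffix are skipped
    )
    if not paths:
        raise ValueError(f"No JSON files found in {pp_directory}")
    return paths


# Two JSON files and one text file that should be ignored.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.json", "a.json", "notes.txt"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("{}")
    found = [os.path.basename(p) for p in find_json_files(tmp)]
```

Sorting the paths is a choice made for this sketch so that the result is deterministic; the source does not say in which order the files are read.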
- gpsea.preprocessing.load_phenopacket_files(pp_files: Iterator[str], cohort_creator: CohortCreator[Phenopacket], validation_policy: Literal['none', 'lenient', 'strict'] = 'none') Tuple[Cohort, PreprocessingValidationResult] [source]
Load phenopacket JSON files, validate the data, and assemble into a Cohort.
Phenopackets are validated, assembled into a cohort, and the validation results are reported back.
- Parameters:
pp_files – an iterator with paths to phenopacket JSON files.
cohort_creator – cohort creator for turning a phenopacket collection into a Cohort.
validation_policy – a str with the validation policy. The value must be one of {‘none’, ‘lenient’, ‘strict’}
- Returns:
a tuple with the cohort and the validation result.
- gpsea.preprocessing.load_phenopackets(phenopackets: Iterator[Phenopacket], cohort_creator: CohortCreator[Phenopacket], validation_policy: Literal['none', 'lenient', 'strict'] = 'none') Tuple[Cohort, PreprocessingValidationResult] [source]
Validate the phenopackets and assemble into a Cohort.
The results of the validation are reported back.
- Parameters:
phenopackets – an iterator with the phenopackets to validate and assemble into a cohort.
cohort_creator – cohort creator for turning a sequence of phenopackets into a Cohort.
validation_policy – a str with the validation policy. The value must be one of {‘none’, ‘lenient’, ‘strict’}
- Returns:
a tuple with the cohort and the validation result.
- class gpsea.preprocessing.TranscriptCoordinateService[source]
Bases: object
TranscriptCoordinateService gets transcript (tx) coordinates for a given transcript ID.
- abstract fetch(tx: str | TranscriptInfoAware) TranscriptCoordinates [source]
Get tx coordinates for a tx ID or an entity that knows about the tx ID.
The method will raise an exception in case of an issue.
- Parameters:
tx – a str with tx ID (e.g. NM_002834.5) or an entity that knows about the transcript ID (e.g. TranscriptAnnotation).
- Returns:
the transcript coordinates.
- class gpsea.preprocessing.GeneCoordinateService[source]
Bases: object
GeneCoordinateService gets transcript (Tx) coordinates for a gene ID.
- abstract fetch_for_gene(gene: str) Sequence[TranscriptCoordinates] [source]
Get Tx coordinates for a gene ID.
The method will raise an exception in case of an issue.
- Parameters:
gene – a str with gene ID (e.g. HGNC:3603)
- Returns:
a sequence of transcript coordinates for the gene.
- Return type:
Sequence[TranscriptCoordinates]
- class gpsea.preprocessing.PhenotypeCreator(hpo: MinimalOntology, validator: ValidationRunner)[source]
Bases: Auditor[Iterable[Tuple[str, bool]], Sequence[Phenotype]]
PhenotypeCreator validates the input phenotype features and prepares them for the downstream analysis.
The creator expects an iterable with tuples that contain a CURIE and status. The CURIE must correspond to an HPO term identifier and status must be a bool.
The creator prunes CURIEs with simple errors such as a malformed CURIE or non-HPO terms and validates the rest with HPO toolkit’s validator.
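The pruning step described above can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the helper name prune_features, the regular expression for a well-formed HPO CURIE (the HP prefix, a colon, and seven digits), and the wording of the messages. The sketch does not cover the second stage, where the remaining terms are validated with HPO toolkit’s validator.

```python
import re
from typing import Iterable, List, Tuple

# Assumed shape of an HPO CURIE: `HP`, a colon, and seven digits.
HPO_CURIE = re.compile(r"HP:\d{7}")


def prune_features(
    features: Iterable[Tuple[str, bool]],
) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Hypothetical helper: split features into kept items and problem messages."""
    kept: List[Tuple[str, bool]] = []
    problems: List[str] = []
    for curie, is_present in features:
        if not isinstance(is_present, bool):
            problems.append(f"{curie}: status must be a bool")
        elif HPO_CURIE.fullmatch(curie) is None:
            # Covers both malformed CURIEs and terms from other ontologies.
            problems.append(f"{curie}: not a well-formed HPO CURIE")
        else:
            kept.append((curie, is_present))
    return kept, problems


kept, problems = prune_features(
    [("HP:0001250", True), ("HP:12", False), ("MONDO:0005148", True)]
)
```

Collecting the problems instead of raising on the first one mirrors the Auditor idea of reporting issues back to the caller together with the sanitized input.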
- class gpsea.preprocessing.ProteinAnnotationCache(datadir: str)[source]
Bases: object
A class that stores or retrieves ProteinMetadata objects using pickle format.
- get_annotations(protein_id: str): Searches a given data directory for a pickle file with given ID and returns ProteinMetadata
- store_annotations(protein_id: str, annotation: ProteinMetadata): Creates a pickle file with given ID and stores the given ProteinMetadata into that file
- get_annotations(protein_id: str) ProteinMetadata | None [source]
Searches a given data directory for a pickle file with given ID and returns ProteinMetadata from file. Returns None if no file is found.
- Parameters:
protein_id (string) – The protein_id associated with the desired ProteinMetadata
- store_annotations(protein_id: str, annotation: ProteinMetadata)[source]
Creates a pickle file with the given protein id in the file name and stores the given ProteinMetadata in the file.
- Parameters:
protein_id (string) – The protein_id associated with the ProteinMetadata
annotation (ProteinMetadata) – The ProteinMetadata object that will be stored under the given protein id
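A pickle-backed cache of this kind can be sketched in a few lines. The class PickleCache below is a hypothetical stand-in, not the gpsea class: the file naming scheme (one file per key with a .pickle suffix) and the use of a plain dict as the stored value are assumptions made so that the example runs without gpsea installed.

```python
import os
import pickle
import tempfile
from typing import Any, Optional


class PickleCache:
    """Hypothetical stand-in for a pickle-backed annotation cache."""

    def __init__(self, datadir: str):
        os.makedirs(datadir, exist_ok=True)
        self._datadir = datadir

    def _path(self, key: str) -> str:
        # Assumed naming scheme: one pickle file per key.
        return os.path.join(self._datadir, f"{key}.pickle")

    def get_annotations(self, key: str) -> Optional[Any]:
        path = self._path(key)
        if not os.path.isfile(path):
            return None  # cache miss: no file for this key
        with open(path, "rb") as fh:
            return pickle.load(fh)

    def store_annotations(self, key: str, annotation: Any) -> None:
        with open(self._path(key), "wb") as fh:
            pickle.dump(annotation, fh)


with tempfile.TemporaryDirectory() as tmp:
    cache = PickleCache(tmp)
    miss = cache.get_annotations("NP_001027558.1")  # nothing stored yet
    cache.store_annotations("NP_001027558.1", {"label": "example"})
    hit = cache.get_annotations("NP_001027558.1")
```

Pickle files should only be read from a location you control, since unpickling untrusted data can execute arbitrary code.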
- class gpsea.preprocessing.ProtCachingMetadataService(cache: ProteinAnnotationCache, fallback: ProteinMetadataService)[source]
Bases: ProteinMetadataService
A class that retrieves ProteinMetadata from the cache if it exists, or runs the fallback ProteinMetadataService if it does not.
- annotate(protein_id: str): Gets metadata and returns ProteinMetadata for given protein ID
- annotate(protein_id: str) ProteinMetadata [source]
Gets metadata for given protein ID
- Parameters:
protein_id (string) – A protein ID
- Returns:
A ProteinMetadata object
- Return type:
ProteinMetadata
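The cache-then-fallback pattern behind this class can be sketched as below. The names CachingService and slow_lookup are hypothetical, the in-memory dict stands in for the on-disk cache, and strings stand in for ProteinMetadata; whether the real class writes fallback results back to the cache is an assumption of this sketch.

```python
from typing import Callable, Dict, List, Optional


class CachingService:
    """Hypothetical cache-then-fallback wrapper (not the gpsea class)."""

    def __init__(self, fallback: Callable[[str], str]):
        self._cache: Dict[str, str] = {}
        self._fallback = fallback

    def annotate(self, protein_id: str) -> str:
        cached: Optional[str] = self._cache.get(protein_id)
        if cached is not None:
            return cached
        # Cache miss: ask the fallback and remember the answer (assumed).
        result = self._fallback(protein_id)
        self._cache[protein_id] = result
        return result


calls: List[str] = []


def slow_lookup(protein_id: str) -> str:
    calls.append(protein_id)  # record each request that reaches the fallback
    return f"metadata for {protein_id}"


service = CachingService(slow_lookup)
first = service.annotate("NP_037407.4")
second = service.annotate("NP_037407.4")  # served from the cache
```

The second call never reaches slow_lookup, which is the point of putting a cache in front of a remote API.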
- class gpsea.preprocessing.UniprotProteinMetadataService(timeout: float = 30.0)[source]
Bases: ProteinMetadataService
A class that creates ProteinMetadata objects from data found with the Uniprot REST API. More info on the Uniprot REST API is in the Programmatic access section.
- static parse_uniprot_json(payload: Mapping[str, Any], protein_id: str) ProteinMetadata [source]
Try to extract ProteinMetadata corresponding to protein_id from the Uniprot JSON payload.
- Parameters:
payload – a JSON object corresponding to Uniprot response
protein_id – a str with the accession of the protein of interest
- Returns:
a complete instance of ProteinMetadata
- Raises:
- annotate(protein_id: str) ProteinMetadata [source]
Get metadata for the given protein ID. This class works only with a RefSeq database ID (e.g. NP_037407.4).
- Parameters:
protein_id (string) – A protein ID
- Returns:
A ProteinMetadata object for the given protein ID
- Return type:
ProteinMetadata
- class gpsea.preprocessing.VepFunctionalAnnotator(include_computational_txs: bool = False, timeout: float = 10.0)[source]
Bases: FunctionalAnnotator
A FunctionalAnnotator that uses the Variant Effect Predictor (VEP) REST API to do functional variant annotation.
- Parameters:
- NONCODING_EFFECTS = {VariantEffect.DOWNSTREAM_GENE_VARIANT, VariantEffect.FIVE_PRIME_UTR_VARIANT, VariantEffect.INTERGENIC_VARIANT, VariantEffect.INTRON_VARIANT, VariantEffect.NON_CODING_TRANSCRIPT_EXON_VARIANT, VariantEffect.NON_CODING_TRANSCRIPT_VARIANT, VariantEffect.SPLICE_ACCEPTOR_VARIANT, VariantEffect.SPLICE_DONOR_5TH_BASE_VARIANT, VariantEffect.SPLICE_DONOR_VARIANT, VariantEffect.SPLICE_POLYPYRIMIDINE_TRACT_VARIANT, VariantEffect.THREE_PRIME_UTR_VARIANT, VariantEffect.UPSTREAM_GENE_VARIANT}
Non-coding variant effects where we do not complain if the functional annotation lacks the protein effects.
- annotate(variant_coordinates: VariantCoordinates) Sequence[TranscriptAnnotation] [source]
Perform functional annotation using Variant Effect Predictor (VEP) REST API.
- Parameters:
variant_coordinates (VariantCoordinates) – A VariantCoordinates object
- Returns:
A sequence of transcript annotations for the variant coordinates
- Return type:
Sequence[TranscriptAnnotation]
- Raises:
ValueError – if VEP times out or does not return a response or if the response is not formatted as we expect.
- process_response(variant_key: str, response: Mapping[str, Any]) Sequence[TranscriptAnnotation] [source]
- fetch_response(variant_coordinates: VariantCoordinates) Mapping[str, Any] [source]
Get a dict with the response from the VEP REST API.
- Parameters:
variant_coordinates – a query VariantCoordinates.
- static format_coordinates_for_vep_query(vc: VariantCoordinates) str [source]
Converts the 0-based VariantCoordinates to ones that will be interpreted correctly by VEP.
Example - an insertion/duplication of G after the given G at coordinate 3:
1 2 3 4 5
A C G T A
0-based: 2 3 G GG
1-based: 3 G GG
VEP: 4 3 - G
- Parameters:
vc (VariantCoordinates) – A VariantCoordinates object
- Returns:
The variant coordinates formatted to work with VEP
- Return type:
string
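The conversion in the example above can be reproduced with a short sketch. The function to_vep_style is hypothetical and is not the library's implementation: it assumes the input is a 0-based, half-open span plus ref and alt alleles, trims the bases shared at the left end, and returns a tuple rather than the formatted query string.

```python
from typing import Tuple


def to_vep_style(start: int, end: int, ref: str, alt: str) -> Tuple[int, int, str, str]:
    """Hypothetical helper: 0-based, half-open span -> VEP-style 1-based values."""
    # A 0-based half-open span [start, end) covers 1-based positions start + 1 .. end.
    one_based_start = start + 1
    # Drop the bases shared by ref and alt at the left end (the anchor base).
    shared = 0
    while shared < min(len(ref), len(alt)) and ref[shared] == alt[shared]:
        shared += 1
    ref, alt = ref[shared:], alt[shared:]
    one_based_start += shared
    # An empty allele is written as "-". For a pure insertion this leaves
    # start == end + 1, as in the "VEP: 4 3 - G" line of the example.
    return one_based_start, end, ref or "-", alt or "-"


insertion = to_vep_style(2, 3, "G", "GG")
snv = to_vep_style(2, 3, "G", "T")
deletion = to_vep_style(2, 4, "GT", "G")
```

The insertion case is the one worth remembering: the start coordinate ends up one larger than the end coordinate, which is how an empty reference span is expressed.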
- class gpsea.preprocessing.VariantAnnotationCache(datadir: str)[source]
Bases: object
A class that stores or retrieves variant annotations using pickle format.
- get_annotations(variant_coordinates: VariantCoordinates): Searches a given data directory for a pickle file with variant coordinates and returns the stored annotations
- store_annotations(variant_coordinates: VariantCoordinates, annotations: Sequence[TranscriptAnnotation]): Creates a pickle file with variant coordinates and stores the given annotations into that file
- get_annotations(variant_coordinates: VariantCoordinates) Sequence[TranscriptAnnotation] | None [source]
Searches a given data directory for a pickle file with given variant coordinates and returns the annotations from the file. Returns None if no file is found.
- Parameters:
variant_coordinates (VariantCoordinates) – The variant_coordinates associated with the desired annotations
- store_annotations(variant_coordinates: VariantCoordinates, annotations: Sequence[TranscriptAnnotation])[source]
Creates a pickle file with the given variant coordinates in the file name and stores the given annotations in the file.
- Parameters:
variant_coordinates (VariantCoordinates) – The variant_coordinates associated with the annotations
annotations (Sequence[TranscriptAnnotation]) – Annotations that will be stored under the given variant coordinates
- class gpsea.preprocessing.VarCachingFunctionalAnnotator(cache: VariantAnnotationCache, fallback: FunctionalAnnotator)[source]
Bases: FunctionalAnnotator
A class that retrieves the cached annotations if they exist or runs the fallback FunctionalAnnotator if they do not.
- annotate(variant_coordinates: VariantCoordinates): Gets data and returns the annotations for given variant coordinates
- static with_cache_folder(fpath_cache_dir: str, fallback: FunctionalAnnotator)[source]
Create caching functional annotator that will store the data in fpath_cache_dir and use fallback to annotate the missing variants.
- annotate(variant_coordinates: VariantCoordinates) Sequence[TranscriptAnnotation] [source]
Gets the annotations for given variant coordinates
- Parameters:
variant_coordinates (VariantCoordinates) – A VariantCoordinates object
- Returns:
A sequence of transcript annotations
- Return type:
Sequence[TranscriptAnnotation]
- class gpsea.preprocessing.VVHgvsVariantCoordinateFinder(genome_build: GenomeBuild, timeout: int = 30)[source]
Bases: VariantCoordinateFinder[str]
VVHgvsVariantCoordinateFinder uses Variant Validator’s REST API to build VariantCoordinates from an HGVS string.
The finder takes an HGVS str (e.g. NM_005912.3:c.253A>G) and extracts the variant coordinates from the response.
- Parameters:
genome_build – the genome build to use to construct VariantCoordinates
timeout – the REST API request timeout
- find_coordinates(item: str) VariantCoordinates | None [source]
Extracts variant coordinates from an HGVS string using Variant Validator’s REST API.
- Parameters:
item – an HGVS string
- Returns:
variant coordinates
- class gpsea.preprocessing.VVMultiCoordinateService(genome_build: GenomeBuild, timeout: float = 30.0)[source]
Bases: TranscriptCoordinateService, GeneCoordinateService
VVMultiCoordinateService uses the Variant Validator REST API to fetch transcript coordinates for both a gene ID and a specific transcript ID.
- Parameters:
genome_build – the genome build for constructing the transcript coordinates.
timeout – a positive float with the REST API timeout in seconds.
- fetch(tx: str | TranscriptInfoAware) TranscriptCoordinates [source]
Get tx coordinates for a tx ID or an entity that knows about the tx ID.
The method will raise an exception in case of an issue.
- Parameters:
tx – a str with tx ID (e.g. NM_002834.5) or an entity that knows about the transcript ID (e.g. TranscriptAnnotation).
- Returns:
the transcript coordinates.
- fetch_for_gene(gene: str) Sequence[TranscriptCoordinates] [source]
Get Tx coordinates for a gene ID.
The method will raise an exception in case of an issue.
- Parameters:
gene – a str with gene ID (e.g. HGNC:3603)
- Returns:
a sequence of transcript coordinates for the gene.
- Return type:
Sequence[TranscriptCoordinates]
- parse_response(tx_id: str, response) TranscriptCoordinates [source]
- class gpsea.preprocessing.Auditor[source]
Bases: Generic[IN, OUT]
Auditor checks the inputs for sanity issues and relates the issues with sanitized inputs as SanitationResults.
The auditor may sanitize the input as a matter of discretion and returns the input as OUT.
- static prepare_notepad(label: str) NotepadTree [source]
Prepare a Notepad for recording issues and errors.
- Parameters:
label – a str with the top-level section label.
- Returns:
an instance of NotepadTree.
- Return type:
NotepadTree
- class gpsea.preprocessing.DataSanityIssue(level: Level, message: str, solution: str | None = None)[source]
Bases: object
DataSanityIssue summarizes an issue found in the input data.
The issue has a level, a message with human-friendly description, and an optional solution for removing the issue.
- class gpsea.preprocessing.Level(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases: Enum
An enum to represent severity of the DataSanityIssue.
- WARN = 1
Warning is an issue when something is not entirely right. However, unlike Level.ERROR, the analysis should complete albeit with sub-optimal results 😧.
- ERROR = 2
Error is a serious issue in the input data and the downstream analysis may not complete or the analysis results may be malarkey 😱.
- class gpsea.preprocessing.Notepad(label: str)[source]
Bases: object
Record issues encountered during parsing/validation of a hierarchical data structure.
The issues can be organized in sections. Notepad keeps track of issues in one section and the subsections can be created by calling add_subsection(). The function returns an instance responsible for issues of a subsection.
A collection of the issues from the current section is available via the issues property and the convenience functions provide iterators over errors and warnings.
- abstract add_subsection(label: str) Notepad [source]
Add a labeled subsection.
- Returns:
a notepad for recording issues within the subsection.
- Return type:
Notepad
- property issues: Sequence[DataSanityIssue]
Get an iterable with the issues of the current section.
- add_issue(level: Level, message: str, solution: str | None = None)[source]
Add an issue with certain level, message, and an optional solution.
- add_error(message: str, solution: str | None = None)[source]
A convenience function for adding an error with a message and an optional solution.
- add_warning(message: str, solution: str | None = None)[source]
A convenience function for adding a warning with a message and an optional solution.
- errors() Iterator[DataSanityIssue] [source]
Iterate over the errors of the current section.
- warnings() Iterator[DataSanityIssue] [source]
Iterate over the warnings of the current section.
- class gpsea.preprocessing.NotepadTree(label: str, level: int)[source]
Bases: Notepad
NotepadTree implements Notepad using a tree where each tree node corresponds to a (sub)section. The node can have 0..n children.
Each node has a label, a collection of issues, and children with subsections. For convenience, the node has level to correspond to the depth of the node within the tree (the level of the root node is 0).
The nodes can be accessed via the children property or through convenience methods for tree traversal, either using the visitor pattern (visit()) or by iterating over the nodes via iterate_nodes(). In both cases, the traversal is done in the depth-first fashion.
- property children
- add_subsection(identifier: str)[source]
Add a labeled subsection.
- Returns:
a notepad for recording issues within the subsection.
- Return type:
- visit(visitor)[source]
Perform a depth-first search on the tree and call visitor with all nodes.
- Parameters:
visitor – a callable that takes the current node as a single argument.
- iterate_nodes()[source]
Iterate over nodes in the depth-first fashion.
- Returns:
a depth-first node iterator.
- has_warnings(include_subsections: bool = False) bool [source]
- Returns:
True if one or more warnings were found in the current section or its subsections.
- Return type:
bool
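The tree layout and the depth-first traversal described for NotepadTree can be illustrated with a self-contained sketch. The Node class below is hypothetical and much smaller than the real class; in particular, visiting a node before its children (pre-order) is an assumption, since the source only says that the traversal is depth-first.

```python
from enum import Enum
from typing import Callable, Iterator, List, Tuple


class Level(Enum):
    WARN = 1
    ERROR = 2


class Node:
    """Hypothetical tree node: a label, its own issues, and child sections."""

    def __init__(self, label: str, level: int = 0):
        self.label = label
        self.level = level  # depth in the tree, the root is at level 0
        self.issues: List[Tuple[Level, str]] = []
        self.children: List["Node"] = []

    def add_subsection(self, label: str) -> "Node":
        child = Node(label, self.level + 1)
        self.children.append(child)
        return child

    def add_warning(self, message: str) -> None:
        self.issues.append((Level.WARN, message))

    def iterate_nodes(self) -> Iterator["Node"]:
        # Pre-order depth-first traversal: the node first, then each subtree.
        yield self
        for child in self.children:
            yield from child.iterate_nodes()

    def visit(self, visitor: Callable[["Node"], None]) -> None:
        for node in self.iterate_nodes():
            visitor(node)

    def has_warnings(self, include_subsections: bool = False) -> bool:
        nodes = self.iterate_nodes() if include_subsections else [self]
        return any(lvl is Level.WARN for n in nodes for lvl, _ in n.issues)


root = Node("cohort")
patient = root.add_subsection("patient #1")
patient.add_subsection("phenotypes").add_warning("term is obsolete")
root.add_subsection("patient #2")
order = [(n.level, n.label) for n in root.iterate_nodes()]
```

In this sketch the warning sits two levels below the root, so root.has_warnings() is False while root.has_warnings(include_subsections=True) is True.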
- class gpsea.preprocessing.DefaultImpreciseSvFunctionalAnnotator(gene_coordinate_service: GeneCoordinateService)[source]
Bases: ImpreciseSvFunctionalAnnotator
- annotate(item: ImpreciseSvInfo) Sequence[TranscriptAnnotation] [source]
Compute functional annotations for a large SV.
- Returns:
a sequence of transcript annotations
- Raises:
ValueError – if the annotation cannot proceed due to the remote resource being offline, etc.