ppktstore.model module

class ppktstore.model.PhenopacketInfo[source]

Bases: object

Phenopacket plus metadata.

abstract property path: str

Path of the phenopacket source relative to the enclosing cohort.

abstract property phenopacket: Phenopacket

The phenopacket.

class ppktstore.model.EagerPhenopacketInfo(path: str, phenopacket: Phenopacket)[source]

Bases: PhenopacketInfo

Phenopacket info with eagerly loaded phenopacket.

static from_path(path: str, pp_path: Path)[source]

property path: str

Path of the phenopacket source relative to the enclosing cohort.

property phenopacket: Phenopacket

The phenopacket.

class ppktstore.model.CohortInfo(name: str, path: str, phenopackets: Collection[PhenopacketInfo])[source]

Bases: object

Cohort of a Phenopacket store.

Includes cohort-level metadata and a collection of phenopacket infos for the included phenopackets.

name: str

Cohort name, e.g. FBN1.

path: str

Path of the cohort relative to the enclosing source.

phenopackets: Collection[PhenopacketInfo]

The cohort phenopacket infos.

iter_phenopackets() → Iterator[Phenopacket][source]

Get an iterator with all phenopackets of the cohort.
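The relationship between the phenopacket infos and the iterator can be sketched as follows. This is a minimal illustration, not the library's code: InfoSketch and CohortSketch are hypothetical stand-ins that mirror the documented attribute names, a plain dict plays the role of a Phenopacket, and the assumption is that iteration simply unwraps each info.

```python
from dataclasses import dataclass
from typing import Collection, Iterator


@dataclass
class InfoSketch:
    """Hypothetical stand-in for PhenopacketInfo; a dict plays the phenopacket."""
    path: str
    phenopacket: dict


@dataclass
class CohortSketch:
    """Hypothetical stand-in mirroring the documented CohortInfo attributes."""
    name: str
    path: str
    phenopackets: Collection[InfoSketch]

    def iter_phenopackets(self) -> Iterator[dict]:
        # Assumed behaviour: unwrap each info into its phenopacket, one at a time.
        return (info.phenopacket for info in self.phenopackets)


cohort = CohortSketch(
    name="FBN1",
    path="FBN1",
    phenopackets=[
        InfoSketch("a.json", {"id": "a"}),
        InfoSketch("b.json", {"id": "b"}),
    ],
)
ids = [pp["id"] for pp in cohort.iter_phenopackets()]
```

Because the sketch returns a generator, each phenopacket is only touched when the caller consumes it, which would matter if the infos were loaded lazily.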

class ppktstore.model.PhenopacketStore[source]

Bases: object

PhenopacketStore provides the data and metadata for Phenopacket Store cohorts.

Use from_release_zip() or from_notebook_dir() to open a store instance.

static from_release_zip(zip_file: ZipFile, strategy: Literal['eager', 'lazy'] = 'eager')[source]

Read PhenopacketStore from a release ZIP archive.

The archive structure must match the structure of the ZIP archives created by ppktstore.archive.PhenopacketStoreArchiver. Only JSON phenopacket format is supported at the moment.

Strategy

The phenopackets can be loaded eagerly or lazily. The ‘eager’ strategy parses all phenopackets up front, at the cost of a longer loading time and higher RAM usage. The ‘lazy’ strategy only scans the ZIP archive for phenopackets and parses each one on demand, when the PhenopacketInfo.phenopacket property is accessed. As a result, lazy loading only succeeds while the ZIP handle is open.

Note

We recommend using Python’s context manager to ensure zip_file is closed:

>>> import zipfile
>>> from ppktstore.model import PhenopacketStore
>>> with zipfile.ZipFile("all_phenopackets.zip") as zf:
...   ps = PhenopacketStore.from_release_zip(zf)
...   # Do things here...

Parameters:

zip_file – a ZIP archive handle.

strategy – a str with the strategy for loading phenopackets, one of {‘eager’, ‘lazy’}.

Returns:

PhenopacketStore with data read from the archive.
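The practical consequence of the lazy strategy can be illustrated with the standard library alone. The sketch below does not use ppktstore: LazyEntry is a hypothetical stand-in for a lazily loaded phenopacket info, and the archive is built in memory so the example is self-contained. It shows that on-demand parsing works inside the with block and fails once the handle is closed.

```python
import io
import json
import zipfile


class LazyEntry:
    """Hypothetical stand-in for a lazily loaded phenopacket info."""

    def __init__(self, zf: zipfile.ZipFile, name: str):
        self._zf = zf
        self._name = name

    @property
    def payload(self) -> dict:
        # Parsing happens on access, so the archive must still be open.
        with self._zf.open(self._name) as fh:
            return json.load(fh)


# Build a tiny in-memory archive with a single JSON document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("FBN1/example.json", json.dumps({"id": "example"}))

buffer.seek(0)
with zipfile.ZipFile(buffer) as zf:
    entry = LazyEntry(zf, "FBN1/example.json")
    inside = entry.payload  # works: the handle is open

try:
    entry.payload  # the handle was closed when the with block ended
    failed_after_close = False
except ValueError:
    # zipfile raises ValueError when a closed archive is read.
    failed_after_close = True
```

With a lazily loaded store, this suggests finishing all phenopacket access inside the with block, or choosing the ‘eager’ strategy when the data must outlive the handle.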

static from_notebook_dir(nb_dir: str, pp_dir: str = 'phenopackets')[source]

Create PhenopacketStore from Phenopacket store notebook dir nb_dir.

We expect nb_dir to include a folder per cohort, and the phenopackets should be stored in the pp_dir sub-folder (pp_dir=phenopackets by default).

The phenopackets are loaded eagerly into memory.
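The expected directory layout can be sketched with pathlib. This is an illustration under stated assumptions, not the library's implementation: scan_notebook_dir is a hypothetical helper, the layout is assumed to be nb_dir/<cohort>/<pp_dir>/*.json, and plain dicts stand in for parsed phenopackets.

```python
import json
import tempfile
from pathlib import Path


def scan_notebook_dir(nb_dir: str, pp_dir: str = "phenopackets") -> dict:
    """Hypothetical helper: map cohort name to a list of parsed JSON documents."""
    cohorts = {}
    for cohort_dir in sorted(Path(nb_dir).iterdir()):
        candidate = cohort_dir / pp_dir
        if not candidate.is_dir():
            continue  # not a cohort folder, or it has no phenopackets sub-folder
        cohorts[cohort_dir.name] = [
            json.loads(fp.read_text(encoding="utf-8"))
            for fp in sorted(candidate.glob("*.json"))
        ]
    return cohorts


# Create a throwaway notebook dir with one cohort and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "SUOX" / "phenopackets"
    target.mkdir(parents=True)
    (target / "individual.json").write_text(
        json.dumps({"id": "individual"}), encoding="utf-8"
    )
    (Path(tmp) / "README.md").write_text("not a cohort", encoding="utf-8")
    found = scan_notebook_dir(tmp)
```

Everything is read inside the loop, which corresponds to the eager loading described above.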

abstract property name: str

Get a str with the Phenopacket Store name. Most of the time, the name corresponds to the release tag (e.g. 0.1.18).

abstract property path: Path

Get path to the phenopacket store resource.

abstractmethod cohorts() → Collection[CohortInfo][source]

Get a collection of all Phenopacket Store cohorts.

abstractmethod cohort_for_name(name: str) → CohortInfo[source]

Retrieve a Phenopacket Store cohort by its name.

Parameters:

name – a str with the cohort name (e.g. SUOX).

Raises:

KeyError – if no cohort with such name exists.

iter_cohort_phenopackets(name: str) → Iterator[Phenopacket][source]

Get an iterator with all phenopackets of a cohort.

Parameters:

name – a str with the cohort name.

cohort_names() → Iterator[str][source]

Get an iterator with names of all Phenopacket Store cohorts.

cohort_count() → int[source]

Compute the count of Phenopacket Store cohorts.

phenopacket_count() → int[source]

Compute the total number of phenopackets available in Phenopacket Store.
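Only cohorts() and cohort_for_name() are abstract, so the remaining methods can presumably be derived from them. The sketch below illustrates that pattern under this assumption. StoreSketch and DictStoreSketch are hypothetical stand-ins, not part of ppktstore, and a cohort is simplified to a (name, phenopackets) pair.

```python
from abc import ABC, abstractmethod
from typing import Collection, Dict, Iterator, List


class StoreSketch(ABC):
    """Hypothetical stand-in; a cohort is a (name, phenopackets) pair here."""

    @abstractmethod
    def cohorts(self) -> Collection[tuple]: ...

    @abstractmethod
    def cohort_for_name(self, name: str) -> tuple: ...

    # The convenience methods below rely only on the abstract ones.
    def cohort_names(self) -> Iterator[str]:
        return (name for name, _ in self.cohorts())

    def cohort_count(self) -> int:
        return len(self.cohorts())

    def phenopacket_count(self) -> int:
        return sum(len(pps) for _, pps in self.cohorts())


class DictStoreSketch(StoreSketch):
    """Hypothetical dict-backed implementation of the two abstract methods."""

    def __init__(self, data: Dict[str, List[dict]]):
        self._data = data

    def cohorts(self):
        return list(self._data.items())

    def cohort_for_name(self, name):
        return name, self._data[name]  # raises KeyError for unknown names


store = DictStoreSketch({"FBN1": [{"id": "a"}, {"id": "b"}], "SUOX": [{"id": "c"}]})
names = sorted(store.cohort_names())
total = store.phenopacket_count()
try:
    store.cohort_for_name("MISSING")
    missing_raises = False
except KeyError:
    missing_raises = True
```

A subclass then only has to supply the two lookups, and callers can guard cohort_for_name() with a KeyError handler as documented.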

class ppktstore.model.DefaultPhenopacketStore(name: str, path: Path, cohorts: Iterable[CohortInfo])[source]

Bases: PhenopacketStore

property name: str

Get a str with the Phenopacket Store name. Most of the time, the name corresponds to the release tag (e.g. 0.1.18).

property path: Path

Get path to the phenopacket store resource.

cohorts() → Collection[CohortInfo][source]

Get a collection of all Phenopacket Store cohorts.

cohort_for_name(name: str) → CohortInfo[source]

Retrieve a Phenopacket Store cohort by its name.

Parameters:

name – a str with the cohort name (e.g. SUOX).

Raises:

KeyError – if no cohort with such name exists.

class ppktstore.model.ZipPhenopacketInfo(path: str, pp_path: Path)[source]

Bases: PhenopacketInfo

Loads the phenopacket from a ZIP archive on demand.

property path: str

Path of the phenopacket source relative to the enclosing cohort.

property phenopacket: Phenopacket

The phenopacket.