gpsea.analysis.pcats.stats package

class gpsea.analysis.pcats.stats.CountStatistic(name: str)[source]

Bases: Statistic

CountStatistic calculates a p value for a contingency table produced by a pair of discrete random variables.

Supports shape

CountStatistic takes the counts in the form of a data frame, and some statistics impose additional requirements on the frame shape. For instance, GPSEA’s implementation of the Fisher exact test can compare counts in a (2, 2) or (2, 3) array, whereas the χ2 test can handle an (m, n) array.

It is important to check that a genotype/phenotype predicate produces a number of groups that the statistic can test.

The supports_shape property returns a sequence with requirements on the shape of the data array/frame. The sequence includes the number of

Examples

Test	Array shape	supports_shape
Fisher Exact Test	`(2, [2, 3])`	`(2, [2, 3])`
χ2	`(m, n)`	`(None, None)`
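
The check described above can be sketched in plain Python. The helper name shape_is_supported is hypothetical and not part of GPSEA. The sketch assumes that an int entry requires an exact size along that axis, a sequence lists the allowed sizes, and None accepts any size, which is how the examples in the table read.

```python
from typing import Sequence, Union

# Mirrors the annotation of `supports_shape`: one entry per array axis.
ShapeSpec = Sequence[Union[int, Sequence[int], None]]


def shape_is_supported(shape: Sequence[int], spec: ShapeSpec) -> bool:
    """Hypothetical helper: test a frame shape against a `supports_shape` value.

    Assumed semantics (not confirmed by the GPSEA documentation):
    int -> exact size, sequence -> any of the listed sizes, None -> any size.
    """
    if len(shape) != len(spec):
        return False
    for dim, allowed in zip(shape, spec):
        if allowed is None:
            continue  # no constraint on this axis
        if isinstance(allowed, int):
            if dim != allowed:
                return False
        elif dim not in allowed:
            return False
    return True


# The two values listed in the table above.
FISHER_SPEC = (2, [2, 3])
CHI2_SPEC = (None, None)

print(shape_is_supported((2, 3), FISHER_SPEC))  # a (2, 3) frame fits Fisher
print(shape_is_supported((2, 4), FISHER_SPEC))  # a (2, 4) frame does not
print(shape_is_supported((5, 7), CHI2_SPEC))    # any 2D shape fits chi-squared
```

With a pandas data frame, the first argument would be counts.shape.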

abstractmethod compute_pval(counts: DataFrame) → StatisticResult[source]

abstract property supports_shape: Sequence[int | Sequence[int] | None]: Get a sequence of the supported shapes.

class gpsea.analysis.pcats.stats.FisherExactTest[source]

Bases: CountStatistic

FisherExactTest performs Fisher’s Exact Test on a 2x2 or 2x3 contingency table.

The 2x2 version is a thin wrapper around SciPy’s fisher_exact() function, while the 2x3 variant is implemented in Python. Both variants test against the two-sided alternative hypothesis \(H_1\).
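
To illustrate what a two-sided exact test on a 2x3 table involves, here is a minimal sketch that uses only the standard library. It is not GPSEA’s implementation. The function name fisher_exact_2x3 is hypothetical, and the sketch assumes the common definition of the two-sided p value: the total probability of all tables with the same margins that are no more likely than the observed table.

```python
from math import comb


def fisher_exact_2x3(table):
    """Two-sided Fisher exact test for a 2x3 table (illustrative sketch only).

    `table` holds two rows of three non-negative integer counts.
    """
    (a, b, c), (d, e, f) = table
    col_sums = (a + d, b + e, c + f)
    row1 = a + b + c
    total = sum(col_sums)
    denom = comb(total, row1)

    def prob(x, y, z):
        # Multivariate hypergeometric probability of first row (x, y, z)
        # given the fixed row and column sums.
        return (
            comb(col_sums[0], x) * comb(col_sums[1], y) * comb(col_sums[2], z)
        ) / denom

    p_obs = prob(a, b, c)
    p_value = 0.0
    # Enumerate every first row compatible with the margins.
    for x in range(min(col_sums[0], row1) + 1):
        for y in range(min(col_sums[1], row1 - x) + 1):
            z = row1 - x - y
            if z > col_sums[2]:
                continue
            p = prob(x, y, z)
            # Small relative tolerance guards against floating point ties.
            if p <= p_obs * (1 + 1e-7):
                p_value += p
    return min(p_value, 1.0)


print(fisher_exact_2x3([[3, 0, 0], [0, 3, 3]]))
print(fisher_exact_2x3([[1, 1, 1], [1, 1, 1]]))
```

The enumeration is quadratic in the first row sum, which is acceptable for the small counts where an exact test is typically preferred.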

compute_pval(counts: DataFrame) → StatisticResult[source]

property supports_shape: Sequence[int | Sequence[int] | None]: Get a sequence of the supported shapes.