gpsea.analysis.pcats.stats package

class gpsea.analysis.pcats.stats.CountStatistic(name: str)[source]

Bases: Statistic

CountStatistic calculates a p value for a contingency table produced by a pair of discrete random variables.

Supports shape

CountStatistic takes the counts in the form of a data frame, and some statistics impose additional requirements on the frame shape. For instance, GPSEA’s implementation of the Fisher exact test can compare counts in a (2, 2) or (2, 3) array, whereas the χ2 test can test an (m, n) array.

It is important to check that a genotype/phenotype predicate produces the number of groups which the statistic can test.

The supports_shape property returns a sequence with requirements on the shape of the data array/frame. The sequence includes the number of

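Examples

=================  ===========  ==============
Test               Array shape  supports_shape
=================  ===========  ==============
Fisher Exact Test  (2, [2, 3])  (2, [2,3])
χ2                 (*, *)       (None, None)
=================  ===========  ==============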
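
As an illustration of how the supports_shape values in the examples above could be used, the following is a minimal sketch of a shape check. The helper shape_is_supported is hypothetical and not part of GPSEA. It assumes that an int demands an exact size along that axis, a sequence of ints lists the accepted sizes, and None accepts any size.

```python
from typing import Sequence, Tuple, Union

# One entry per axis: an exact size, a list of accepted sizes, or None for "any".
ShapeSpec = Sequence[Union[int, Sequence[int], None]]


def shape_is_supported(shape: Tuple[int, ...], supports_shape: ShapeSpec) -> bool:
    """Hypothetical helper: check a frame shape against a supports_shape value."""
    if len(shape) != len(supports_shape):
        return False
    for size, allowed in zip(shape, supports_shape):
        if allowed is None:
            # Assumption: None places no restriction on this axis.
            continue
        if isinstance(allowed, int):
            if size != allowed:
                return False
        elif size not in allowed:
            return False
    return True


# Values copied from the examples above.
FISHER_SHAPE = (2, [2, 3])
CHI2_SHAPE = (None, None)

print(shape_is_supported((2, 3), FISHER_SHAPE))  # True
print(shape_is_supported((3, 3), FISHER_SHAPE))  # False
print(shape_is_supported((4, 5), CHI2_SHAPE))    # True
```

With a pandas data frame, the shape argument would typically be counts.shape.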

abstract property supports_shape: Sequence[int | Sequence[int] | None]

Get a sequence of the supported shapes.

abstract compute_pval(counts: DataFrame) → float[source]

class gpsea.analysis.pcats.stats.FisherExactTest[source]

Bases: CountStatistic

FisherExactTest performs Fisher’s Exact Test on a 2x2 or 2x3 contingency table.

The 2x2 version is a thin wrapper around SciPy’s fisher_exact() function, while the 2x3 variant is implemented in Python. Both variants test the two-sided alternative hypothesis \(H_1\).
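
To make the 2x3 case concrete, here is a minimal pure-Python sketch of a two-sided Fisher exact test on a 2x3 table. It is not GPSEA’s implementation, and the function name fisher_exact_2x3 is hypothetical. It assumes the common two-sided convention: with the row and column sums held fixed, the p value is the total probability of all tables that are no more likely than the observed one.

```python
from math import comb
from typing import Sequence


def fisher_exact_2x3(table: Sequence[Sequence[int]]) -> float:
    """Two-sided Fisher exact p value for a 2x3 table of counts (illustrative sketch)."""
    (a_obs, b_obs, c_obs), row2 = table
    col_sums = [a_obs + row2[0], b_obs + row2[1], c_obs + row2[2]]
    r1 = a_obs + b_obs + c_obs  # sum of the first row
    n = sum(col_sums)           # grand total

    def weight(a: int, b: int, c: int) -> int:
        # Numerator of the multivariate hypergeometric probability of the
        # table whose first row is (a, b, c); the denominator is comb(n, r1).
        return comb(col_sums[0], a) * comb(col_sums[1], b) * comb(col_sums[2], c)

    w_obs = weight(a_obs, b_obs, c_obs)
    total = 0
    # Enumerate every first row compatible with the fixed margins.
    for a in range(min(r1, col_sums[0]) + 1):
        for b in range(min(r1 - a, col_sums[1]) + 1):
            c = r1 - a - b
            if c > col_sums[2]:
                continue
            w = weight(a, b, c)
            # Integer comparison, so no floating-point tolerance is needed.
            if w <= w_obs:
                total += w
    return total / comb(n, r1)


p = fisher_exact_2x3([[2, 0, 0], [0, 1, 1]])
print(p)
```

Comparing integer weights instead of floating-point probabilities avoids having to choose a tolerance when deciding which tables are as extreme as the observed one.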

property supports_shape: Sequence[int | Sequence[int] | None]

Get a sequence of the supported shapes.

compute_pval(counts: DataFrame) → float[source]