Hi Eleanor et al.,

I’ve committed this class with a small but important
patch. This class uses a Stirling approximation for large factorials when
computing the Fisher exact probability for a 2x2 contingency matrix.
NIAID and their group provided this code to support EASE. The code is
originally from Bell Labs.
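
For readers who have not seen the technique, here is a minimal sketch of how a Stirling-based log-factorial can feed the probability of a single 2x2 table. It is not the Bell Labs code: the class name, the method names, the cutoff of 20 and the choice of two correction terms in the Stirling series are all assumptions made for illustration.

```java
final class StirlingFisherSketch {
    // Hypothetical cutoff: below it, sum the logs directly instead of approximating.
    private static final int EXACT_CUTOFF = 20;

    // ln(n!) by direct summation for small n, by Stirling's series for large n.
    static double logFactorial(int n) {
        if (n < EXACT_CUTOFF) {
            double sum = 0.0;
            for (int i = 2; i <= n; i++) {
                sum += Math.log(i);
            }
            return sum;
        }
        double x = n;
        // ln n! ~ n ln n - n + 0.5 ln(2 pi n) + 1/(12 n) - 1/(360 n^3)
        return x * Math.log(x) - x + 0.5 * Math.log(2.0 * Math.PI * x)
                + 1.0 / (12.0 * x) - 1.0 / (360.0 * x * x * x);
    }

    // Hypergeometric probability of the table [[a, b], [c, d]] with fixed margins:
    // (a+b)! (c+d)! (a+c)! (b+d)! / (n! a! b! c! d!), evaluated in log space.
    static double tableProbability(int a, int b, int c, int d) {
        int n = a + b + c + d;
        double logP = logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(n)
                - logFactorial(a) - logFactorial(b)
                - logFactorial(c) - logFactorial(d);
        return Math.exp(logP);
    }
}
```

Because the nine log-factorial terms are approximated and added in floating point, two tables that are mathematically equally probable need not produce bit-identical doubles.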

The Nonpar Fisher exact test provides a one-tailed and a two-tailed probability.
For the two-tailed part, the code iterates over the possible matrices and sums
the probabilities that are less than or the ‘same’ as that of the original
matrix, i.e. those at least as extreme as the observed one. All computations
are done in double precision. The problem comes up in the other tail, when the
code encounters the transpose of the originally observed matrix. The
approximated probability for the transpose should be exactly the same as for
the original, but it differs in the last digit. I’m now just casting to (float)
to make the probability comparison so that this discrepancy (in the last digit
of a double-precision value) doesn’t affect the result. There may be
other ways to handle this, such as recognizing the transposed matrix corresponding
to the original matrix and adding its probability regardless of the fact
that it’s just a bit larger due to the approximation error at the limits
of machine precision.
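
To make the comparison concrete, here is a minimal sketch of a two-tailed sum that compares probabilities after narrowing them to float. It is an illustration only, not the committed patch: the class and method names are hypothetical, and logFactorial is a plain summation standing in for the Stirling-based routine.

```java
final class TwoTailedSketch {
    // Stand-in for the Stirling-based routine: ln(n!) by direct summation.
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    // Probability of the table [[a, b], [c, d]] given its margins.
    static double tableProbability(int a, int b, int c, int d) {
        int n = a + b + c + d;
        double logP = logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(n)
                - logFactorial(a) - logFactorial(b)
                - logFactorial(c) - logFactorial(d);
        return Math.exp(logP);
    }

    // Sum the probabilities of all tables with the same margins whose
    // probability is no larger than that of the observed table.
    static double twoTailed(int a, int b, int c, int d) {
        int row1 = a + b;
        int col1 = a + c;
        int n = a + b + c + d;
        double observed = tableProbability(a, b, c, d);
        int lo = Math.max(0, row1 + col1 - n); // smallest feasible top-left cell
        int hi = Math.min(row1, col1);         // largest feasible top-left cell
        double sum = 0.0;
        for (int x = lo; x <= hi; x++) {
            double p = tableProbability(x, row1 - x, col1 - x, n - row1 - col1 + x);
            // Narrowing to float discards the trailing digits of the double, so a
            // last-digit difference does not exclude an equally probable table.
            if ((float) p <= (float) observed) {
                sum += p;
            }
        }
        return sum;
    }
}
```

For example, twoTailed(3, 0, 0, 3) adds the probabilities of the observed table and its mirror image in the other tail. One caveat of comparing as float is that probabilities too small for the float range all round to zero and therefore compare as equal.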

Note that in many cases the probability of the transposed matrix
does come out the same as the original, but I have found instances where it does not.

Here’s the code that has been committed:

Note that EASE doesn’t compute the two-tailed
test, since we only care about over-representation in the cluster (one-sided).
Therefore, the FE in EASE doesn’t have the two-tailed test
and doesn’t need this patch.

John

John Braisted

Senior Software Engineer

Pathogen Functional Genomics Resource Center (PFGRC)

J. Craig Venter Institute

9704 Medical Center Drive

Rockville, MD 20850