From: Eleanor Howe <eleanora@ji...>  2009-06-18 21:34:40

Hi John,

Thanks for the fix! I'll propagate it to our bugfix branch and it will go out with v4.4.1 at the end of the month.

Eleanor

Braisted, John C. wrote:
>
> Hi Eleanor et al.,
>
> I’ve committed this class with a small but important patch. This class
> uses a Stirling approximation for large factorials when computing the
> Fisher Exact probability for a 2x2 contingency matrix. NIAID and their
> group provided this code to support EASE. The code is originally from
> Bell Labs.
>
> The Nonpar Fisher exact provides a one-tailed and a two-tailed
> probability. For the two-tailed part, the code iterates over matrices
> that are equal or less extreme than the observed and sums probabilities
> that are less than or the ‘same’ as that of the original matrix. All
> computations are done as double precision numbers. The problem comes
> when you encounter the transpose of the originally observed matrix
> while considering the other tail of the probability. The approximation
> for the p-value should be exactly the same, but it differs in the very
> last digit. I’m now just casting it to (float) to make the probability
> comparison so that this discrepancy (last digit of a double precision
> value) doesn’t affect the result. There may be other ways to handle
> this, like recognizing the transposed matrix corresponding to the
> original matrix and adding its probability regardless of the fact
> that it’s just a bit larger due to the approximation error at the
> limits of machine precision.
>
> Note that in many cases the transpose of the original matrix behaves
> and gives the same value, but I have found instances where this fails.
>
> Here’s the code that has been committed:
>
> http://mevtm4.svn.sourceforge.net/viewvc/mevtm4/trunk/source/org/tigr/microarray/mev/cluster/algorithm/impl/nonpar/NonparHypergeometricProbability.java?view=log
>
> Note that EASE doesn’t compute the two-tailed test since we only care
> about over-representation in the cluster (one-sided). Therefore, the
> FE in EASE doesn’t have the two-tailed test and doesn’t need this patch.
>
> John
>
> John Braisted
> Senior Software Engineer
> Pathogen Functional Genomics Resource Center (PFGRC)
> J. Craig Venter Institute
> 9704 Medical Center Drive
> Rockville, MD 20850
>
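
For reference, the float-cast comparison described in the quoted message can be sketched in Java roughly as follows. This is a minimal illustration, not the committed NonparHypergeometricProbability code: the class name FisherExactSketch, its method names, and the cutoff of 20 for switching from exact summation to the Stirling series are all assumptions made for this example.

```java
// Illustrative sketch only; names and structure are hypothetical, not the MeV API.
final class FisherExactSketch {

    /** ln(n!): exact summation for small n, Stirling series for larger n. */
    static double logFactorial(int n) {
        if (n <= 20) { // assumed cutoff, chosen for this sketch
            double sum = 0.0;
            for (int i = 2; i <= n; i++) {
                sum += Math.log(i);
            }
            return sum;
        }
        double x = n;
        // Stirling series with three correction terms.
        return x * Math.log(x) - x + 0.5 * Math.log(2.0 * Math.PI * x)
                + 1.0 / (12.0 * x)
                - 1.0 / (360.0 * x * x * x)
                + 1.0 / (1260.0 * x * x * x * x * x);
    }

    /** Hypergeometric probability of the 2x2 table [[a, b], [c, d]]. */
    static double tableProbability(int a, int b, int c, int d) {
        double logP = logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(a + b + c + d)
                - logFactorial(a) - logFactorial(b)
                - logFactorial(c) - logFactorial(d);
        return Math.exp(logP);
    }

    /**
     * Two-tailed p-value: sums the probabilities of all tables with the same
     * margins whose probability is no larger than the observed one.
     */
    static double twoTailed(int a, int b, int c, int d) {
        int row1 = a + b;
        int row2 = c + d;
        int col1 = a + c;

        // Compare at float precision so that a last-digit difference between
        // two doubles that should be equal does not exclude a table.
        float observed = (float) tableProbability(a, b, c, d);

        int lo = Math.max(0, col1 - row2); // smallest feasible top-left cell
        int hi = Math.min(row1, col1);     // largest feasible top-left cell
        double sum = 0.0;
        for (int x = lo; x <= hi; x++) {
            double p = tableProbability(x, row1 - x, col1 - x, row2 - col1 + x);
            if ((float) p <= observed) {
                sum += p; // the sum itself stays in double precision
            }
        }
        return Math.min(1.0, sum);
    }
}
```

Calling FisherExactSketch.twoTailed(a, b, c, d) on a table and on its transpose, twoTailed(a, c, b, d), should then agree to well within double-precision noise. A comparison with an explicit relative tolerance would be another way to get the same effect; the float cast is shown here because it is the approach the message describes.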