From: Braisted, John C. <braisted@jc...> 2009-06-18 15:22:58

Hi Eleanor et al.,

I've committed this class with a small but important patch. This class uses a Sterling approximation for large factorials when computing the Fisher Exact probability for a 2x2 contingency matrix. NIAID and their group provided this code to support EASE. The code is originally from Bell Labs.

The Nonpar Fisher exact provides both a one-tailed and a two-tailed probability. For the two-tailed part, the code iterates over matrices that are equal or less extreme than the observed and sums the probabilities that are less than or the 'same' as that of the original matrix. All computations are done as double precision numbers. The problem comes when you encounter the transpose of the originally observed matrix while considering the other tail of the probability. The approximation for the p-value should be exactly the same, but it differs in the very last digit. I'm now just casting it to (float) to make the probability comparison so that this discrepancy (last digit of a double precision value) doesn't affect the result. There may be other ways to handle this, like recognizing the transposed matrix corresponding to the original matrix and adding its probability regardless of the fact that it's just a bit larger due to the approximation error at the limits of machine precision.

Note that in many cases the transpose of the original matrix behaves and is the same, but I have found instances where this fails.

Here's the code that has been committed:

http://mevtm4.svn.sourceforge.net/viewvc/mevtm4/trunk/source/org/tigr/microarray/mev/cluster/algorithm/impl/nonpar/NonparHypergeometricProbability.java?view=log

Note that EASE doesn't compute the two-tailed test since we only care about over-representation in the cluster (one-sided). Therefore, the FE in EASE doesn't have the two-tailed test and doesn't need this patch.

John

John Braisted
Senior Software Engineer
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850
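
For readers following the thread, here is a minimal sketch of the comparison issue described above. It is not the committed NonparHypergeometricProbability code; the class and method names (FisherTwoTailSketch, logFactorial, tableProbability, twoTailed) and the cutoff of 20 for switching to the Stirling series are assumptions made purely for illustration. It computes each table's probability from log-factorials and casts both sides of the comparison to float, so that a table whose probability is mathematically equal to the observed one is not dropped because of a last-digit difference in double precision.

```java
class FisherTwoTailSketch {

    // ln(n!): exact summation for small n, Stirling series for larger n.
    // The cutoff of 20 is an arbitrary choice for this sketch.
    static double logFactorial(int n) {
        if (n < 2) {
            return 0.0;
        }
        if (n < 20) {
            double sum = 0.0;
            for (int i = 2; i <= n; i++) {
                sum += Math.log(i);
            }
            return sum;
        }
        double x = n;
        // n ln n - n + 0.5 ln(2 pi n) + 1/(12 n) - 1/(360 n^3)
        return x * Math.log(x) - x + 0.5 * Math.log(2.0 * Math.PI * x)
                + 1.0 / (12.0 * x) - 1.0 / (360.0 * x * x * x);
    }

    // Hypergeometric probability of the 2x2 table [[a, b], [c, d]]
    // given its row and column totals.
    static double tableProbability(int a, int b, int c, int d) {
        double logP = logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(a + b + c + d)
                - logFactorial(a) - logFactorial(b)
                - logFactorial(c) - logFactorial(d);
        return Math.exp(logP);
    }

    // Two-tailed p-value: sum the probabilities of all tables with the same
    // margins whose probability is no larger than that of the observed table.
    static double twoTailed(int a, int b, int c, int d) {
        int row1 = a + b;
        int col1 = a + c;
        int n = a + b + c + d;
        double observed = tableProbability(a, b, c, d);
        int lo = Math.max(0, row1 + col1 - n);
        int hi = Math.min(row1, col1);
        double sum = 0.0;
        for (int x = lo; x <= hi; x++) {
            double p = tableProbability(x, row1 - x, col1 - x, n - row1 - col1 + x);
            // Compare at float precision so that last-digit noise in the
            // double values does not exclude an equally probable table.
            if ((float) p <= (float) observed) {
                sum += p;
            }
        }
        return Math.min(1.0, sum);
    }

    public static void main(String[] args) {
        System.out.println(twoTailed(3, 1, 1, 3));
        System.out.println(twoTailed(30, 10, 12, 28));
    }
}
```

The float cast is the approach the message describes. An explicit relative tolerance on the double comparison would be another way to get the same effect, at the cost of choosing a tolerance value.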
From: Braisted, John C. <braisted@jc...> 2009-06-18 15:28:56

Sterling approximation should have been Stirling's approximation... but you get the point.

John Braisted
Senior Software Engineer
Pathogen Functional Genomics Resource Center (PFGRC)
J. Craig Venter Institute
9704 Medical Center Drive
Rockville, MD 20850

From: Braisted, John C.
Sent: Thursday, June 18, 2009 11:23 AM
To: 'mevtm4devel@...'; mev; Mev Harvard
Subject: Committed NonparHypergeometricProbability.java to the trunk area of SF SVN
From: Eleanor Howe <eleanora@ji...> 2009-06-18 21:34:40

Hi John,

Thanks for the fix! I'll propagate it to our bugfix branch and it will go out with v4.4.1 at the end of the month.

Eleanor