Anonymous
2010-12-19
Hi,
For some reason MeV gives different t-test p-values than both Excel and R. For example, I have these 6 numbers:
GroupA: 179.7673133 109.2865837 155.4898853
GroupB: 109.6793997 100.727767 140.8898293
I load them into MeV (latest version, but older versions give the same results) as a single-color array. Then I select the t-test and use the "Between subjects" test with the Welch approximation; everything else is left at the defaults. I get the correct means and standard deviations, but the raw p-value is 0.26480287 (the adjusted p-value is the same). In Excel it is 0.279720154 (two-tailed, unequal-variance t-test). At first I assumed that Excel was wrong, but here is what I get in R:
t.test(c(179.7673133,109.2865837,155.4898853),c(109.6793997,100.727767,140.8898293))
result is:
Welch Two Sample t-test
data: c(179.7673133, 109.2865837, 155.4898853) and c(109.6793997, 100.727767, 140.8898293)
t = 1.2957, df = 3.238, p-value = 0.2797
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-42.18235 104.34688
sample estimates:
mean of x mean of y
148.1813 117.0990
Another example:
GroupA: 416.8104479 371.4943213 414.9956691
GroupB: 263.326655 279.695355 215.8603682
MeV p-val: 0.008767734
Excel, R p-val: 0.004396191
Strange … What is the difference between the t-test in Excel and R and the t-test in MeV? I could not reproduce the Excel/R numbers with any MeV settings.
Any idea?
Marek
Nejc
2012-07-17
The reason for this strange behavior is the implementation of the Welch–Satterthwaite equation, which gives the approximate degrees of freedom when the sample variances are assumed to be unequal.
There is a bug in the file Ttest.java, in the function computeDf(), where the degrees of freedom are computed as:
int df = (int)Math.floor(numerator / denom);
The result should not be rounded down to an integer, so it should instead read:
double df = numerator / denom;
However, the fix is not that simple, because one would first have to modify TDistribution() in JSci.maths.statistics.TDistribution so that it accepts the degrees of freedom as a double and not only as an integer.
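To illustrate the difference, here is a minimal standalone sketch of the Welch–Satterthwaite calculation for the first data set in the question. It is not the MeV source: the class and method names (WelchDf, variance, welchDf) are made up for this example, and it only computes the degrees of freedom, not the p-value, so it does not try to reproduce MeV's output.

```java
// Hypothetical illustration only; not taken from MeV's Ttest.java.
class WelchDf {

    // Unbiased sample variance (divides by n - 1).
    static double variance(double[] x) {
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= x.length;

        double sumSq = 0.0;
        for (double v : x) {
            sumSq += (v - mean) * (v - mean);
        }
        return sumSq / (x.length - 1);
    }

    // Welch–Satterthwaite approximation of the degrees of freedom,
    // returned as a double, i.e. without rounding.
    static double welchDf(double[] a, double[] b) {
        double va = variance(a) / a.length; // s_a^2 / n_a
        double vb = variance(b) / b.length; // s_b^2 / n_b
        double numerator = (va + vb) * (va + vb);
        double denom = va * va / (a.length - 1) + vb * vb / (b.length - 1);
        return numerator / denom;
    }

    public static void main(String[] args) {
        double[] groupA = {179.7673133, 109.2865837, 155.4898853};
        double[] groupB = {109.6793997, 100.727767, 140.8898293};

        double df = welchDf(groupA, groupB);
        int flooredDf = (int) Math.floor(df); // what the int-based code keeps

        System.out.printf("fractional df = %.3f, floored df = %d%n", df, flooredDf);
    }
}
```

For these numbers the fractional value should come out at about 3.238, the same df that R reports above, while the floored value is 3. A t-distribution evaluated at an integer df will therefore generally give a different p-value from R and Excel.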
The same bug also existed in older releases of MS Excel; see "Statistical flaws in Excel" (p. 14).
Hope this helps you a bit.
Nejc