t-test p-val discrepancy MeV vs R or Excel

Help
Anonymous
2010-12-19
2013-05-20

  • Anonymous
    2010-12-19

    Hi,
    For some reason, MeV reports different t-test p-values than both Excel and R. For example, take these six numbers:
    GroupA: 179.7673133 109.2865837 155.4898853
    GroupB: 109.6793997 100.727767 140.8898293

    I load the data into MeV (latest version, though older versions give the same results) as a single-color array. Then I select t-test and use the "Between subjects" test with the Welch approximation; everything else is left at the default. I get the correct means and standard deviations, but the raw p-value is 0.26480287 (the adjusted p-value is the same). In Excel it is 0.279720154 (two-tailed, unequal-variance t-test). At first I assumed that Excel was wrong, but here is what I get in R:

    t.test(c(179.7673133,109.2865837,155.4898853),c(109.6793997,100.727767,140.8898293))
    The result is:

            Welch Two Sample t-test

    data:  c(179.7673133, 109.2865837, 155.4898853) and c(109.6793997, 100.727767, 140.8898293)
    t = 1.2957, df = 3.238, p-value = 0.2797
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
    -42.18235 104.34688
    sample estimates:
    mean of x mean of y
    148.1813  117.0990

    Another example:
    GroupA: 416.8104479 371.4943213 414.9956691
    GroupB: 263.326655 279.695355 215.8603682

    MeV p-val: 0.008767734
    Excel, R p-val: 0.004396191

    Strange. What is the difference between the t-test in Excel and R and the t-test in MeV? I could not reproduce the Excel/R numbers with any MeV settings.
    Any ideas?
    Marek

     
  • Nejc
    2012-07-17

    The reason for this strange behavior is MeV's implementation of the Welch–Satterthwaite equation, which determines the approximate degrees of freedom when the sample variances are assumed to be unequal.

    There is a bug in the file Ttest.java, in the function computeDf(), where the degrees of freedom are computed as:

    int df = (int)Math.floor(numerator / denom);

    The result should not be rounded down to an integer, so it should instead read:

    double df = numerator / denom;

    However, the fix is not that simple: one must first modify TDistribution() in JSci.maths.statistics.TDistribution so that it accepts the degrees of freedom as a double rather than only as an integer.
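
    To illustrate the difference, here is a minimal, self-contained sketch of the Welch t statistic and the Welch–Satterthwaite degrees of freedom for Marek's first data set. The class name WelchDf and its method names are hypothetical and are not taken from MeV's Ttest.java; the sketch only shows how truncating the degrees of freedom to an integer changes the value passed to the t distribution. It does not compute p-values and does not attempt to reproduce MeV's exact numbers.

```java
public class WelchDf {
    // Sample mean.
    public static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    // Unbiased sample variance (divides by n - 1).
    public static double variance(double[] x) {
        double m = mean(x);
        double ss = 0.0;
        for (double v : x) ss += (v - m) * (v - m);
        return ss / (x.length - 1);
    }

    // Welch t statistic: difference of means over the unpooled standard error.
    public static double welchT(double[] a, double[] b) {
        double se2 = variance(a) / a.length + variance(b) / b.length;
        return (mean(a) - mean(b)) / Math.sqrt(se2);
    }

    // Welch-Satterthwaite degrees of freedom, kept as a double (no rounding).
    public static double welchDf(double[] a, double[] b) {
        double va = variance(a) / a.length;
        double vb = variance(b) / b.length;
        double numerator = (va + vb) * (va + vb);
        double denom = va * va / (a.length - 1) + vb * vb / (b.length - 1);
        return numerator / denom;
    }

    public static void main(String[] args) {
        double[] groupA = {179.7673133, 109.2865837, 155.4898853};
        double[] groupB = {109.6793997, 100.727767, 140.8898293};
        double df = welchDf(groupA, groupB);
        System.out.println("t            = " + welchT(groupA, groupB));
        System.out.println("df (double)  = " + df);
        // Truncation in the style of the line quoted from computeDf().
        System.out.println("df (floored) = " + (int) Math.floor(df));
    }
}
```

    For this data the unrounded values agree with the R output above (t of about 1.2957, df of about 3.238), while the floored df is 3.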

    The same bug also existed in older releases of MS Excel; see "Statistical flaws in Excel" (p. 14).

    Hope this helps you a bit.
    Nejc