Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project! See Demo


#932 Difference in Drotmg implementation with the Netlib reference implementation

Brendan Tracey

I noticed that the Drotmg implementation is different than the Netlib reference implementation. When d1 <= rgamsq, there is a do-while loop controlling how many times to multiply by gamma on line 109 of ATL_rotmg.c. The while loop states "while (d1 <= gamsq)"; d1 will be multiplied until it is greater than gamsq. However, in the netlib reference implementation [1], the while loop is "while (d1 <= rgamsq || d1 > gamsq)"; d1 will be multiplied until it is greater than rgamsq, not gamsq. Is this behavioral difference an error in the Netlib reference implementation or an error in ATLAS?

Thank you very much


    • assigned_to: R. Clint Whaley
  • It appears to me it is a versioning problem. It looks like someone completely rewrote the ROTMG that appears in LAPACK between 3.2.1 and 3.4.1.

    As far as I know, if there is an operational difference between 3.2.1 and 3.4.1, then it would be an error in LAPACK, since the BLAS standard has not been amended.

    However, they may have made changes (without telling me, at least) to improve stability or something, without amending the standard (??).

    I'm not sure where in the lapack code you are comparing to. Can you scope lapack 3.2.1's DROTMG and see if you think it matches ATLAS? If it does, then the discrepency seems to come from lapack, and I think you need to take this up with the lapack group (and please CC me!).

    I won't have time to do any ATLAS work for a month or two, so hopefully you can figure something out! At any rate, thank you very much for reporting this discrepency, and keep me updated on anything you figure out.

    Many thanks,

  • Brendan Tracey
    Brendan Tracey

    The 3.2.1 Fortran code is much harder to read, but I'm pretty sure it is consistent with the current Netlib implementation and not the ATLAS one.

    Lines 142 to 152 of drotmg.f are the following:

    110 CONTINUE
    IF (DD1.EQ.ZERO) GO TO 160
    GO TO 70
    120 CONTINUE
    DD1 = DD1
    DX1 = DX1/GAM
    DH11 = DH11/GAM
    DH12 = DH12/GAM
    GO TO 110

    To me, this seems to translate to "while dd1 <= rgamsq", and not the "while d1 <= gamsq" that is line 109 of ATL_rotmg.c. Note that the behavior of the ATLAS code is not consistent with itself. The first case, d1 <= rgamsq, loops until d1 > gamsq (not rgamsq). In contrast, the case where d1 >= gamsq loops until d1 < gamsq, when d2 <= rgamsq it loops until d2 > rgamsq, and when d2 >= gamsq, it loops until d2 < gamsq. As a result, I am inclined to believe that this is an ATLAS bug, and line 109 of ATL_rotmg.c should read "while(d1 <= rgamsq)" (subbing rgamsq from gamsq).

  • Do you have a test case that ATLAS fails and netlib passes, or did you find this purely by inspection?

  • If you could find a test case where ATLAS and netlib differ, it would be a big help. From a cursory examination, it looks like I have split a loop for performance reasons to cause the differences you are seeing. I would have to do a very detailed analysis to re-prove that this loop fission is legal, which I don't have time for now (I did this analysis and obviously thought it was legal before the rewrite back in '99). If you can find any case where ATLAS and netlib differ, then that would be proof that something I did was illegal. Since rotmg has worked for this long, I tend to hope my original rewrite was OK.

  • I have tested BLAS bundled with LAPACKs 3.2.1 and 3.5.0 against tests in LAPACK 3.5.0. All tests passed.
    I tested ATLAS 3.8.4 against the same tests and SROTMG/DROTMG failed. Other tests were ok.

    Here are the results:
    Test of subprogram number 11 DROTMG


    11 5 9999 9999 1 0.21250649D+13 0.75497472D-02 0.2125D+13 0.7550D-02
    11 5 9999 9999 3 0.19402554D-05 0.32552083D+02 -0.3255D+02 0.3255D+02
    11 5 9999 9999 6 0.14551915D-10 0.24414062D-03 -0.2441D-03 0.2441D-03
    11 5 9999 9999 8 0.48506384D-07 0.81380208D+00 -0.8138D+00 0.8138D+00

    Last edit: Vadim Zborovskii 2014-03-24
  • Note to self:
    Need to reproduce this using lapack3.5.0 at home.

    • status: open --> open-invalid
  • 3.8 is no longer supported. Do you get this behavior with 3.10?

    I run the BLAS tester with every install, and have never seen it. What exact test is failing?


  • Brendan Tracey
    Brendan Tracey


    Sorry for the really late reply. Sourceforge just alerted me that there had been any responses to this thread at all. It's one of the tests in


    (Drotmg tests start on 1317). I'm working now to recompile and link in ATLAS to this test suite, and I should get back to you shortly.

  • What is the github link to? Assuming that test fails, for the same install, does "make check" from ATLAS BLDdir pass? "make check" runs the standard BLAS tester, in addition to ATLAS's own tests.

    If the github test is not the API tester, how can I be sure the error isn't in the test itself?


  • Brendan Tracey
    Brendan Tracey

    I believe it is a bug in ATLAS because OpenBLAS and OSX Accelerate framework both pass the test. I have also showed you the line in the ATLAS implementation which disagrees with the reference implementation.

    Here are the specific tests and the output from our test suite:
    Name: "RD1_Med_RD2_Small_Flag_1",
    P: &blas.DrotmParams{
    Flag: blas.Rescaling,
    H: [4]float64{0.0002197265625, -1, 0.000244140625, 3.375e-11},
    D1: 1.2,
    D2: 0.000000000045,
    X1: 2.7,
    Y1: 80000000000,
    Rd1: 0.0007549747199770676,
    Rd2: 1.19999999996355,
    Rx1: 1.9531250000593264e+07,

    Name: "D1_Small_D2_Big_Flag_1",
    P: &blas.DrotmParams{
    Flag: blas.Rescaling,
    H: [4]float64{2.3731773997569866e+10, -1.6777216e+07, 0.000244140625, 1.6777216e-07},
    D1: 120000000000000000,
    D2: 0.000000000012345,
    X1: 0.08,
    Y1: 8000000000000,
    Rd1: 0.00010502490698765249,
    Rd2: 216.1836123957717,
    Rx1: 3.8516669198055897e+09,

    Name: "RD1_Small_RD2_Med_Flag_0",
    P: &blas.DrotmParams{
    Flag: blas.Rescaling,
    H: [4]float64{0.000244140625, -1e-08, 0.24414062499999997, 1},
    D1: 0.0000000002,
    D2: 20,
    X1: 0.8,
    Y1: 0.000000008,
    Rd1: 0.003355409645903541,
    Rd2: 19.99980000199998,
    Rx1: 0.000195314453125,
    Name: "RD1_Small_RD2_Med_Flag_1",
    P: &blas.DrotmParams{
    Flag: blas.Rescaling,
    H: [4]float64{0.0012207031250000002, -1, 0.000244140625, 1e-09},
    D1: 0.02,
    D2: 0.000000000004,
    X1: 0.008,
    Y1: 8000000,
    Rd1: 6.710886366445568e-05,
    Rd2: 0.019999999900000003,
    Rx1: 1953.125009765625,


    --- FAIL: TestDrotmg (0.00 seconds)
    level1double.go:1622: drotmg H mismatch RD1_Med_RD2_Small_Flag_1: expected [0.0002197265625 -1 0.000244140625 3.375e-11], found [1.3096723705530167e-11 -1 1.4551915228366852e-11 3.375e-11]
    level1double.go:1627: drotmg rd1 mismatch RD1_Med_RD2_Small_Flag_1: expected 0.0007549747199770676, found 2.1250649172267914e+11
    level1double.go:1633: drotmg rx1 mismatch RD1_Med_RD2_Small_Flag_1: expected 1.9531250000593264e+07, found 1.1641532183047094
    level1double.go:1622: drotmg H mismatch D1_Small_D2_Big_Flag_1: expected [2.3731773997569866e+10 -1.6777216e+07 0.000244140625 1.6777216e-07], found [1414.5239590150038 -1.6777216e+07 1.4551915228366852e-11 1.6777216e-07]
    level1double.go:1627: drotmg rd1 mismatch D1_Small_D2_Big_Flag_1: expected 0.00010502490698765249, found 2.9561883248388298e+10
    level1double.go:1633: drotmg rx1 mismatch D1_Small_D2_Big_Flag_1: expected 3.8516669198055897e+09, found 229.57723854813514
    level1double.go:1622: drotmg H mismatch RD1_Small_RD2_Med_Flag_0: expected [0.000244140625 -1e-08 0.24414062499999997 1], found [1.4551915228366852e-11 -1e-08 1.455191522836685e-08 1]
    level1double.go:1627: drotmg rd1 mismatch RD1_Small_RD2_Med_Flag_0: expected 0.003355409645903541, found 9.444638519354097e+11
    level1double.go:1633: drotmg rx1 mismatch RD1_Small_RD2_Med_Flag_0: expected 0.000195314453125, found 1.1641648598015309e-11
    level1double.go:1622: drotmg H mismatch RD1_Small_RD2_Med_Flag_1: expected [0.0012207031250000002 -1 0.000244140625 1e-09], found [7.275957614183427e-11 -1 1.4551915228366852e-11 1e-09]
    level1double.go:1627: drotmg rd1 mismatch RD1_Small_RD2_Med_Flag_1: expected 6.710886366445568e-05, found 1.888946583703125e+10
    level1double.go:1633: drotmg rx1 mismatch RD1_Small_RD2_Med_Flag_1: expected 1953.125009765625, found 0.00011641532240901142