#955 xztfc does not converge for AARCH64

Group: Developer_(v3.11.x)
Status: closed-fixed
Labels: None
Priority: 5
Updated: 2014-09-17
Created: 2014-08-14
Private: No

I am using 3.11.28 with redhat's patches to define aarch64. I am using a locally modified version of gcc (APM-6.0.4) 4.8.1.

I did a:
../configure -b 64 --with-netlib-lapack-tarfile=../../lapack-3.5.0.tgz -D c -DWALL

The xztfc step starts out fine, but eventually gets to the point where it is searching for larger and larger values of K, and never gets a speed up. Do you have any suggestions on what I should try?

tail of log:

TEST TA TB M N K alpha beta Time Mflop SpUp
==== == == === === === ===== ===== ===== ===== ====== ===== ====

42 T T 10 10 10 -1.0 0.0 1.0 0.0 12.49 820.8 1.00
42 T T 10 10 10 -1.0 0.0 1.0 0.0 11.78 870.2 1.06

TEST TA TB M N K alpha beta Time Mflop SpUp
==== == == === === === ===== ===== ===== ===== ====== ===== ====

43 N T 10 10 10 -1.0 0.0 1.0 0.0 11.48 893.3 1.00
43 N T 10 10 10 -1.0 0.0 1.0 0.0 14.78 693.8 0.78
44 N T 288 288 750 -1.0 0.0 1.0 0.0 3.03 3450.0 1.00
44 N T 288 288 750 -1.0 0.0 1.0 0.0 3.11 3364.1 0.98
45 N T 288 288 1500 -1.0 0.0 1.0 0.0 3.18 3442.9 1.00
45 N T 288 288 1500 -1.0 0.0 1.0 0.0 3.38 3243.2 0.94
46 N T 288 288 2250 -1.0 0.0 1.0 0.0 3.02 3464.1 1.00
46 N T 288 288 2250 -1.0 0.0 1.0 0.0 3.11 3361.8 0.97
47 N T 288 288 3000 -1.0 0.0 1.0 0.0 3.47 3444.9 1.00
47 N T 288 288 3000 -1.0 0.0 1.0 0.0 3.72 3209.0 0.93
48 N T 288 288 3750 -1.0 0.0 1.0 0.0 3.58 3470.6 1.00
48 N T 288 288 3750 -1.0 0.0 1.0 0.0 3.83 3246.6 0.94
49 N T 288 288 4500 -1.0 0.0 1.0 0.0 3.44 3470.9 1.00

Discussion

1 2 > >> (Page 1 of 2)
  • here is the log file

     
  • BTW, here is some information about the ARM naming convention.

    Technically, the name of the architecture is ARMv8 (ARM version 8). ARMv8 is an evolutionary step over ARMv7, and is mostly backward compatible. It adds a new, 64-bit instruction set to the two instruction sets that already existed in v7: the ARM instruction set and the Thumb instruction set, both of which are 32-bit.

    The ARM and Thumb instruction sets can interoperate: within a single branch, one can switch arbitrarily between ARM and Thumb. This is not true between the new 64-bit instruction set and either of the old ones; the only way to switch is through a properly constructed IRET. This is a large divide. If you are operating on the ARM/Thumb side, you are said to be operating in AArch32; if you are on the 64-bit instruction set side, you are operating in AArch64. Further, the ARM instruction set is now referred to as A32, and Thumb as T32.

    As part of the base v8 architecture, you MUST implement not only floating point, but SIMD instructions as well.

    The people at redhat who have begun patching ATLAS to support ARMv8 chose to name the ARCH enum AARCH64. I originally did not: I referred to it as ARMv8, and chose the assembler name gas_aarch64. I am not too hung up on the name either way. There would be NO reason to have both the 32- and 64-bit ATLAS libraries built on the same machine, since one CANNOT interoperate between them; all user-level code will be in either AArch32 or AArch64.

    Since they are out ahead of me, I have changed to their naming convention.

     
  • In the 3.11 series, the results of this test actually don't matter anymore, except when you want to compare new GEMM vs old GEMM. So, in 3.11 I think we can get away with simply hacking this. Try adding this at line 498 of ATLAS/tune/blas/gemm/tfc.c (the first line of the shape loop in DoShapes):
    #if defined(DCPLX)
       if (shape==ATLASMN_NB)
          n = 8000;
       else
    #endif
    I'm still messing a lot with threading in 3.11, so it may be pretty unstable. From what I recall, 3.10 is probably not a whole lot slower for ARM, but its configure support sucks for ARM. However, configure may not enter into it, since you are having to hack in ARMv8 support, which I don't have in either package. Therefore, you might try 3.10 and see how it compares, particularly if you hit errors with the threaded build in 3.11.

    I think on both versions performance will suck (compared to peak, not reference BLAS), because they won't have working assembly kernels, and in the past the compilers on early ARMs often produced code with errors, and not great performance even when it worked.

    As for red hat's patches, they spent seconds explaining them to me before just doing what they wanted, so their decisions will have no effect on what ATLAS eventually does. For the record, I really hate the name AARCH64. Why is ARCH in there? If it stands for ARM Architecture 64 bit, why isn't it called ARM64? It reminds me of Intel calling their new core Core. My blood pressure has still not normalized :)

    I'm still waiting to get access to 64-bit arm, so that I can get this officially added to ATLAS.

     
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,3 @@
    -
     I am using 3.11.28 with redhat's patches to define aarch64.  I am using a locally modified version of gcc (APM-6.0.4) 4.8.1.
    
    • assigned_to: R. Clint Whaley
     
  • Yes, the first A in AARCH64 is ARM.

    The reason it is not ARM64 is that:
    1) ARM is the name of the 32-bit fixed-length instruction set.
    2) When they added 64-bit support in v8, they wanted a collective name for both 32-bit instruction sets. They did not want to call ARM + Thumb "ARM32", thus AARCH32.
    3) The flip side of AARCH32 is AARCH64.

    I guess it is less pompous/more creative than calling your architecture CoreXX

    Since it may not be your biggest architecture, perhaps ARMARCH64 would be more clear.

    I will try a 3.11.28 build with your suggested hack, then follow it with your suggestion to use 3.10.

    The "locally modified" 4.8.1 compilers do pretty well at vectorizing scalar C code. We have seen significant increases in the SPECint subtests that can use SIMD. AARCH64 requires SIMD, and they cleaned up a lot of the old junk from SIMD on AArch32.

    I am very interested in getting thread support running stably, even if the kernels are less efficient than they should be.

     
  • What I object to is spelling out ARCH in the architecture name. By this logic, shouldn't we talk about IA32Arch, AMD64Arch, etc.? I mean, why is it there, am I missing something?

    Even if you are drunk and demand people refer to you as ClintSecondName, why do you abbreviate that CSecondName rather than Clint_sn, which would make at least some more sense to me, since you use the letters on the part that is unique to you, rather than wasting them on the part that is true for literally everyone in the entire world.

    From what you have said, I'm guessing I'm likely to use something like ARMv8 as the arch string, but ATLAS always suffixes every ARCH with the pointer width, so that will become something like ARMv864, even though no ARMv832 exists. I'm always left grasping for the right approach to the incredibly complicated ARM ecosystem, which is fairly hard to understand from an outsider's perspective. I got ATLAS ARM support as far as it is because of the help of folks like Tom Wallace of vesperix, who use it on ARM and are willing to explain things to me (Tom once held my hand to the extent of sending me a preformatted SD card when I bricked my ARM development platform, so I could adapt 3.11 to the changed linux/gcc hard-fp ABI default).

    I've had very little luck with autovectorizers on generated GEMM kernels, which is why I'm predicting poor performance until I (or somebody with early machine access) gets a chance to extend ATLAS's SIMD generators to the new platform and/or write some assembly kernels.

    If you get 3.10 and 3.11 installed, I'd really appreciate some timing comparisons (I actually don't know which will be faster in serial, much less in parallel). All my threading development is currently on x86, which has strongly ordered caches. If you get the newest 3.11, be sure to add the following flags to your configure:
    -D c -DATL_SLEEPTPOOL=1

    This should avoid some stuff that depends on the x86's strongly-ordered cache.

    If you get something working, and are interested in helping me understand the patches, it might be possible to put some basic stuff in the developer release even before I get access to a machine . . .

    Let me know,
    Clint

     
  • The 3.11.28 make build did not complete with your hack.

    It dies in xdlanbsrch with a segfault.

    make[4]: Leaving directory `/home/dnuechte/ATLAS/build.8/bin'
    /usr/bin/aarch64-apm-linux-gnu-gfortran -O2 -ggdb -o xdlanbsrch_pt dlanbsrch_pt.o \
       /home/dnuechte/ATLAS/build.8/lib/libtstatlas.a /home/dnuechte/ATLAS/build.8/lib/libptlapack.a /home/dnuechte/ATLAS/build.8/lib/libptcblas.a /home/dnuechte/ATLAS/build.8/lib/libptf77blas.a \
       /home/dnuechte/ATLAS/build.8/lib/libatlas.a -lpthread -lm
    ./xdlanbsrch_pt -oc res/ATL_dtGetNB_geqrf
    maxN = 3960
    *** Error in `./xdlanbsrch_pt': free(): invalid pointer: 0x0000000012faf140 ***
    *** Error in `./xdlanbsrch_pt': malloc(): memory corruption: 0x0000000012faf1d0 ***
    ^C
    ERROR 1157 DURING LAPACK TUNE!!. CHECK INSTALL_LOG/dLATUNE.LOG FOR DETAILS.
    make[2]: Entering directory `/home/dnuechte/ATLAS/build.8/bin'
    cd /home/dnuechte/ATLAS/build.8 ; make error_report
    make[3]: Entering directory `/home/dnuechte/ATLAS/build.8'
    make -f Make.top error_report
    make[4]: Entering directory `/home/dnuechte/ATLAS/build.8'
    uname -a 2>&1 >> bin/INSTALL_LOG/ERROR.LOG
    /usr/bin/gcc -v 2>&1 >> bin/INSTALL_LOG/ERROR.LOG

     
    Attachments
  • You said:
    if (shape==ATLASMN_NB)

    I think you meant:
    if (shape==AtlasMN_NB)

    trying again with a new make build in the same build tree.

     
  • With that change xztfc compiles, but does not seem to be better. If my guess is correct, it is trying to see when a "second run" outperforms the first. It does not seem like it ever does. I will let it continue to run, but this is what the log looks like so far:

    36 T T 21 21 48 -1.0 0.0 1.0 0.0 5.37 1908.9 1.00
    36 T T 21 21 48 -1.0 0.0 1.0 0.0 5.38 1905.8 1.00
    37 T T 27 27 48 -1.0 0.0 1.0 0.0 4.98 2057.2 1.00
    37 T T 27 27 48 -1.0 0.0 1.0 0.0 4.65 2204.8 1.07

    TEST TA TB M N K alpha beta Time Mflop SpUp
    ==== == == === === === ===== ===== ===== ===== ====== ===== ====

    38 T T 10 10 10 -1.0 0.0 1.0 0.0 12.49 820.9 1.00
    38 T T 10 10 10 -1.0 0.0 1.0 0.0 11.78 870.4 1.06

    TEST TA TB M N K alpha beta Time Mflop SpUp
    ==== == == === === === ===== ===== ===== ===== ====== ===== ====

    39 N T 10 10 10 -1.0 0.0 1.0 0.0 11.48 893.0 1.00
    39 N T 10 10 10 -1.0 0.0 1.0 0.0 14.80 692.6 0.78
    40 N T 288 288 750 -1.0 0.0 1.0 0.0 3.03 3448.0 1.00
    40 N T 288 288 750 -1.0 0.0 1.0 0.0 3.10 3372.0 0.98
    41 N T 288 288 1500 -1.0 0.0 1.0 0.0 3.19 3436.5 1.00
    41 N T 288 288 1500 -1.0 0.0 1.0 0.0 3.36 3261.2 0.95
    42 N T 288 288 2250 -1.0 0.0 1.0 0.0 3.04 3437.4 1.00
    42 N T 288 288 2250 -1.0 0.0 1.0 0.0 3.11 3365.0 0.98
    43 N T 288 288 3000 -1.0 0.0 1.0 0.0 3.47 3437.7 1.00
    43 N T 288 288 3000 -1.0 0.0 1.0 0.0 3.69 3238.0 0.94
    44 N T 288 288 3750 -1.0 0.0 1.0 0.0 3.59 3468.1 1.00
    44 N T 288 288 3750 -1.0 0.0 1.0 0.0 3.81 3266.9 0.94
    45 N T 288 288 4500 -1.0 0.0 1.0 0.0 3.45 3465.7 1.00
    45 N T 288 288 4500 -1.0 0.0 1.0 0.0 3.67 3251.7 0.94

     
  • The 3.10 run I started on slightly older hardware completed normally. Here are the results:

               single precision        double precision
            *********************    ********************
               real      complex       real      complex
    

    Benchmark % Clock % Clock % Clock % Clock
    ========= ========= ========= ========= =========
    kSelMM 153.6 156.2 145.2 142.0
    kGenMM 153.6 156.2 145.2 142.0
    kMM_NT 108.9 76.6 102.7 101.1
    kMM_TN 117.8 125.5 105.6 110.8
    BIG_MM 149.6 151.3 135.8 137.4
    kMV_N 72.2 88.8 42.1 71.0
    kMV_T 62.1 94.3 40.9 69.3
    kGER 70.7 85.9 35.9 65.6

     
  • hmm. include/atlas_tcacheedge.h was still just the touched stub. However, if I go to tune/blas/gemm and do a make res/atlas_tcacheedge.h, I do get results which look pretty good:

    /home/dnuechte/ATLAS/build.1/bin/ATLrun.sh /home/dnuechte/ATLAS/build.1/tune/blas/gemm xdtfindCE -f res/atlas_tcacheedge.h
    TA TB M N K alpha beta CacheEdge TIME MFLOPS
    == == ====== ====== ====== ====== ====== ========= ========= ========

    T N 768 1968 5461 1.00 1.00 0 0.786 20992.77
    T N 768 1968 5461 1.00 1.00 32 -2.000 0.00
    T N 768 1968 5461 1.00 1.00 64 0.751 21970.26
    T N 768 1968 5461 1.00 1.00 128 0.712 23177.49
    T N 768 1968 5461 1.00 1.00 256 0.706 23393.38
    T N 768 1968 5461 1.00 1.00 512 0.712 23173.88
    T N 768 1968 5461 1.00 1.00 1024 0.731 22576.40
    T N 768 1968 5461 1.00 1.00 2048 0.760 21717.95
    T N 768 1968 5461 1.00 1.00 4096 0.780 21167.34
    T N 768 1968 5461 1.00 1.00 8192 0.795 20772.86

    Initial CE=256KB, mflop=23393.38

    T N 768 1968 5461 1.00 1.00 192 0.706 23392.82
    T N 768 1968 5461 1.00 1.00 384 0.706 23375.99

    Best CE=256KB, mflop=23393.38

     
  • Here is that 3.10 info, hopefully formatted better

                 single precision     double precision
              ******************* ********************
                   real   complex      real   complex
    Benchmark   % Clock   % Clock   % Clock   % Clock
    ========= ========= ========= ========= =========
    kSelMM        153.6     156.2     145.2     142.0
    kGenMM        153.6     156.2     145.2     142.0
    kMM_NT        108.9      76.6     102.7     101.1
    kMM_TN        117.8     125.5     105.6     110.8
    BIG_MM        149.6     151.3     135.8     137.4
    kMV_N          72.2      88.8      42.1      71.0
    kMV_T          62.1      94.3      40.9      69.3
    kGER           70.7      85.9      35.9      65.6
    
     
    Last edit: R. Clint Whaley 2014-08-16
  • OK, it turns out I have broken something in the old block-major DGEMM code, because if I override archdefs, I also never get crossover anymore. I have opened that problem up on another bug report:
    https://sourceforge.net/p/math-atlas/bugs/238/
    so we can keep it separate from the ARMv8 issues.

    I think I messed something up when I reversed a decision to remove the old DGEMM framework from the developer release (I decided I wanted to keep it around in order to do easy perf comparisons for a while longer).

    In the meantime, I attached a tfc.c that you can use to work around this issue (it might lead to bad results due to the bug, but that probably won't affect your 3.11 performance regardless, since it should use access-major gemm anyway).

    After saving that file as ATLAS/tune/blas/gemm/tfc.c, you can go to BLDdir/tune/blas/gemm and issue

       make dRun_tfc
    

    to force a run for double precision. Just change the d prefix (to c, etc.) to generate all the needed header files before restarting the install (or just restarting the install may work).

    Then we can see how 3.11 compares to 3.10.

    Thanks,
    Clint

     
    Attachments
  • OK, on your performance results, what is the FPU peak of this machine, both for scalar and vector operations?

    From the above results, if the compiler's doing a decent job, I might guess it has 1 FMAC (floating point multiply and accumulate) unit, which gives a scalar performance maximum of 2 FLOPS/cycle. Is that right?

    If that is true, you are getting around 75% of peak for real timings, which is pretty typical for untuned scalar C code. Why your complex perf is so crappy is something we'd have to investigate.

    Of course, the perf is much worse if computed as % of vector peak, I'm guessing.

    Let me know,
    Clint

     
    • We have 1 fpu/simd per core
      8 cores per chip

      each fpu/simd unit has 32, 128-bit registers and a 64-bit datapath. The datapath has a latency of 5 cycles and can support either 2 single or 1 double precision MAC per cycle in a pipelined fashion.

      Using regular floating point instructions, the peak limit is only 1 MAC per cycle per core in either single or double precision.

      Using SIMD instructions, the peak rate for single precision doubles because the 64-bit datapath can start 2 MACs per cycle. It will "double pump" single-precision SIMD operations, and "quad pump" double-precision operations. This means the first two single-precision SIMD results are ready in 5 cycles, and the second two are ready the next cycle. For doubles, the first is ready in 5 cycles, and the rest are done in the 6th, 7th, and 8th cycles.

      We do not support extended precision in either regular or SIMD instructions.

      I am not yet using the autovectorizing compiler. I need to get a clean native build of glibc/binutils/gcc working before I can try it on my system.

      Should the performance numbers include the effects of multiple cores?

       
  • make check passes on 3.10
    make ptcheck passes on 3.10 except for the xXqrtst_pt tests. Each of those fails with one or more threshold tests.

     
    Last edit: Dave Nuechterlein 2014-08-18
  • OK, then your perf results make sense, other than the complex drop, which I don't yet understand.

    On 3.10, why don't you post the output of the failing tests?

    For 3.11, have you got an install working with the provided patch?

     
  • I've been working on getting the locally modified compiler built in my environment. It is really unpleasant since it also needs an upgraded glibc.

    I'm not sure of the best way to get the 3.10 results. When I run make ptcheck, these are the fails from ptsanity.out:

    case d
    Rt Maj M N lda TIME MFLOP RESIDUAL
    == === ===== ===== ===== ========== ========== =========
    QR Col 517 477 517 3.1658e-02 5161.01 1.82e-02
    QL Col 517 477 517 3.0407e-02 5373.34 6.16e+11
    RQ Col 517 477 517 2.9576e-02 5524.98 8.05e+11
    LQ Col 517 477 517 2.9312e-02 5574.74 1.80e-02

    4 cases ran, 2 cases failed, 2 cases passed

    case c
    Rt Maj M N lda TIME MFLOP RESIDUAL
    == === ===== ===== ===== ========== ========== =========
    QR Col 517 477 517 1.2237e-01 5344.77 4.69e+02
    QL Col 517 477 517 1.2109e-01 5401.27 4.81e+02
    RQ Col 517 477 517 1.0048e-01 6509.67 2.56e-02
    LQ Col 517 477 517 1.0053e-01 6506.11 2.30e+03

    4 cases ran, 3 cases failed, 1 cases passed

    case z
    Rt Maj M N lda TIME MFLOP RESIDUAL
    == === ===== ===== ===== ========== ========== =========
    QR Col 517 477 517 6.9378e-02 9427.26 2.10e-02
    QL Col 517 477 517 6.7426e-02 9700.18 2.55e+09
    RQ Col 517 477 517 6.6878e-02 9780.28 2.20e-02
    LQ Col 517 477 517 6.7039e-02 9756.79 1.28e+12

    4 cases ran, 2 cases failed, 2 cases passed

     
  • Results with 3.11.28, the old 4.8.1 compiler, and the patch.

    In my existing build directory, make dRun_tfc did converge.

    A new build also completed:
    ../configure --prefix=/home/dnuechte -b 64 -D c -DWALL -v 3 --with-netlib-lapack-tarfile=../../lapack-3.5.0.tg

              single precision        double precision
            *********************    ********************
               real      complex       real      complex
    

    Benchmark % Clock % Clock % Clock % Clock
    ========= ========= ========= ========= =========
    kSelMM 155.9 159.9 144.2 148.1
    kGenMM 155.9 159.9 144.2 148.1
    kMM_NT 69.4 87.6 56.2 63.3
    kMM_TN 83.5 99.3 77.7 104.6
    BIG_MM 155.3 156.0 141.5 144.0
    kMV_N 87.5 107.6 52.4 86.0
    kMV_T 71.6 113.3 51.1 94.4
    kGER 83.1 90.5 49.3 77.3

    make check passed.
    make ptcheck fails.
    make ssanity_test_pt passes
    make dsanity_test_pt fails in:
    /home/dnuechte/ATLAS/build.9/ATLrun.sh /home/dnuechte/ATLAS/build.9/bin xdslvtst_pt -n 477 -r 517 -O 2 c r >> /home/dnuechte/ATLAS/build.9/bin/ptsanity.out
    RHS=1, nrm=2131725570074374.000000
    RHS=2, nrm=5757786549118771.000000
    RHS=3, nrm=15348469358828296.000000...
    make csanity_test_pt passes
    make zsanity_test_pt fails in:
    /home/dnuechte/ATLAS/build.9/ATLrun.sh /home/dnuechte/ATLAS/build.9/bin xzslvtst_pt -n 477 -r 517 -O 2 c r >> /home/dnuechte/ATLAS/build.9/bin/ptsanity.out
    RHS=1, nrm=551752169811111.312500
    RHS=2, nrm=659379883670181.250000
    RHS=3, nrm=390638824770595.125000
    RHS=4, nrm=547091966786444.875000

     
  • Here are the 3.11.28 results better formatted:

               single precision        double precision
             *********************    ********************
                 real      complex       real      complex
    Benchmark    % Clock    % Clock    % Clock    % Clock
    =========  =========  =========  =========  =========
    kSelMM         155.9      159.9      144.2      148.1
    kGenMM         155.9      159.9      144.2      148.1
    kMM_NT          69.4       87.6       56.2       63.3
    kMM_TN          83.5       99.3       77.7      104.6
    BIG_MM         155.3      156.0      141.5      144.0
    kMV_N           87.5      107.6       52.4       86.0
    kMV_T           71.6      113.3       51.1       94.4
    kGER            83.1       90.5       49.3       77.3
    
     
  • Well, the good news is that 3.11 is at least as fast as 3.10, which I was not confident it would be.

    Now for the errors in both 3.10 and 3.11, I suspect they are caused by compiler errors. I had access briefly to one of the new fast 32-bit ARM chips, and even there, gcc gave errors.

    One way to test this idea, is to try the following in one of your builds:
    1. edit your Make.inc, and for every compiler flags macro, suffix them with -O0
    2. touch ATLAS/include/ BLDdir/include/
    3. make check in BLDdir

    I think this will force a rebuild without optimization of most of the lib (not everything, unfortunately, but it should be most of it). If the errors go away, it is almost certainly a compiler error.

     
  • Good news, bad news. I tried a 3.11.29 build with my AARCH64 changes and the change to tfc.c. It converges just fine. However, dlatune does not complete. I think this bug has outlived its usefulness, and can be closed when the patched tfc.c is put in the code base, if the new error is unrelated to the old:

    The dlatune failure is in xdlanbsrch_pt:
    /usr/bin/aarch64-apm-linux-gnu-gfortran -O2 -ggdb -o xdlanbsrch_pt dlanbsrch_pt.o \
       /home/dnuechte/ATLAS/build.1/lib/libtstatlas.a /home/dnuechte/ATLAS/build.1/lib/libptlapack.a /home/dnuechte/ATLAS/build.1/lib/libptcblas.a /home/dnuechte/ATLAS/build.1/lib/libptf77blas.a \
       /home/dnuechte/ATLAS/build.1/lib/libatlas.a -lpthread -lm
    ./xdlanbsrch_pt -oc res/ATL_dtGetNB_geqrf
    maxN = 3960
    *** Error in `./xdlanbsrch_pt': free(): invalid pointer: 0x000000001eeb8140 ***
    *** Error in `./xdlanbsrch_pt': malloc(): memory corruption: 0x000000001eeb8410

    I am going to turn down the compiler optimization and see if this problem goes away.

     
  • Forgot to add the attachment.

     
  • I was finally able to work on the 3.10 tests again. This build had 0 fails in make check, and 3 fails in make ptcheck, all in the slv tests. I did a make full_test; that ran the better part of a day. It ran really well, and only had 3 failures. Again, they were in the slv tests.

    I then followed your advice with the touches, and reran the make check and the make ptcheck. All of the tests pass. How do I now track down the compiler bug, any suggestions?

     
    Attachments