
Multi-Core computations with IT++ on AMD?

Frank NN
2008-02-29
2012-09-15
  • Frank NN

    Frank NN - 2008-02-29

    Has anybody managed to utilize the 2 cores of a dual-core AMD64 CPU, without adapting the code? I get only 50% CPU load with ITPP simulations, just one CPU core.

    I wonder if the ACML-MP or MKL-MP variants can do the parallelization of long calculations (FFT, solvers) internally. Or does the "multi-core support" in the features announcement mean that I have to do this manually with multi-threaded code?

    I had a similar disappointment with MATLAB. It has the preference option "enable multithreaded computation" for multiple cores, but it does not work: overall CPU load stays at 50%.

     
    • Adam Piątyszek

      Adam Piątyszek - 2008-02-29

      This is a question for ACML or MKL support.
      IT++ is written in a single-threaded manner, although I agree that it would be nice to have some support for OpenMP parallelism included.

      Personally I use MPI when I need to write a parallel simulator suited to my needs, which can be run on a cluster of dozens of nodes.

      /Adam

       
    • Frank NN

      Frank NN - 2008-02-29

      I'm not familiar with the OpenMP and MPI concepts, and cluster computing is another issue. At the moment I'm thinking about the 50% of dual-core capacity that goes unused. I'm just wondering whether it is possible, with existing MATLAB-like IT++ code, to get some automatic parallelism by a simple re-compile with the appropriate MP library, e.g. to distribute a large FFT evenly over 2 cores. I hoped that the MP variants of ACML and MKL would do this, but apparently not (MP = just the thread-safe variant). If it's not possible, OK, then I will try to do this by manually creating 2 FFT threads. I will ask in the ACML forum.

       
    • Frank NN

      Frank NN - 2008-03-02

      I checked the MKL 10 documents. They say that even if the user (or IT++) code itself is not multithreaded or thread-safe, the MKL library will parallelize some functions, such as the FFT, across cores automatically (using internal MKL computing threads). The ACML 4 doc says: "Furthermore, key LAPACK routines have been treated using OpenMP to take advantage of multiple processors when running on SMP machines. Your application will automatically benefit when you link with the OpenMP versions of ACML." I saw some OpenMP examples that went from 50% CPU load to 100% on dual-core CPUs just by introducing OpenMP, without changing the algorithm ("for" loops) at all. The execution time for this loop test was just half that of the non-MP-compiled version.

      For ACML 3.6 GCC on Windows they don't offer OpenMP. So it would be very interesting to have ACML 4 support on Windows in future versions of IT++ (solving the name mangling problem).
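The kind of loop-level OpenMP speed-up mentioned above can be sketched in C as follows. This is a minimal illustration, not code from ACML or IT++: the function name `sum_of_squares` is made up for this example. Compiled with `gcc -fopenmp`, the loop iterations are divided among threads; without the switch the pragma is ignored and the same code runs on one core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function (not part of ACML or IT++):
 * sums the squares of n doubles. The loop body is the same as in a
 * serial version; only the pragma is added. */
double sum_of_squares(const double *x, size_t n)
{
    double total = 0.0;
    long i;  /* signed loop index, which older OpenMP versions require */

    assert(x != NULL || n == 0);

    /* reduction(+:total) gives each thread a private partial sum and
     * adds the partial sums together after the loop. */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < (long)n; i++) {
        total += x[i] * x[i];
    }
    return total;
}
```

The number of threads can be set at run time with the `OMP_NUM_THREADS` environment variable, for example `OMP_NUM_THREADS=2`.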

       
    • Frank NN

      Frank NN - 2008-03-03

      It works. OpenMP parallelism is great. However, OpenMP support has to be enabled in the compiler. A new GCC 4.2.3 for Cygwin is configured like this (watch for "libgomp", which enables OpenMP):

      ../gcc-4.2.3/configure --disable-nls --enable-threads=posix --enable-libgomp --with-x --enable-java-awt=gtk,xlib --without-included-gettext --enable-version-specific-runtime-libs --with-system-zlib --disable-win32-registry --enable-sjlj-exceptions --enable-hash-synchronization --enable-libstdcxx-debug

      GCC can call the ACML 4 DLL (Intel Fortran version) without problems. I got matrix multiplication in ACML 4 working on 2 cores, without changing the code, just by compiling with the "-fopenmp" switch. The FFT I got running in parallel with a #pragma directive in my FFT loop.

      So now let's hope that IT++ will support ACML4 soon.

       
      • Adam Piątyszek

        Adam Piątyszek - 2008-03-04

        Hi Frank,

        > So now let's hope that IT++ will support ACML4 soon.

        In fact, it already does but on Linux only.

        Besides, you should not expect support for other platforms to be automatically included in IT++. IT++ is an open-source library licensed under the GNU GPL. In practice this means that there are no people behind IT++ who are paid for working on it. Therefore, unless you or other users provide a ready (or almost ready) to use solution, which can be easily incorporated into IT++ without breaking other things, you may only dream that "IT++ will support ACML4 (built with Intel Fortran) soon".

        If you think differently, you can always try to contact any of the IT++ developers directly and offer him some gratification for the particular work you would like him to do. But this does not guarantee that the things you request will be accepted.

        Sorry, but this is how the open-source model works for most projects.

        BR,
        /Adam

         
    • Bogdan Cristea

      Bogdan Cristea - 2008-03-04

      Hi Frank
      Could you provide a small example of your matrix multiplication program using two cores? I have not succeeded in doing the same thing on Linux and I am interested in this topic.
      regards
      Bogdan

       
    • Frank NN

      Frank NN - 2008-03-04

      Yes, I know, I already had it running with ACML 4 on Linux. This is the most elegant way. But Windows is not a minority platform. Maybe Cygwin is, but I suppose Visual Studio users will run into the same problem, since func_ is requested by the linker but only FUNC is provided by the library.

      I tried to correct it manually by

      #define fortranfunct_ FORTRANFUNCT

      But it needs some architectural changes in the autoconf setup and in the switches between the libraries. Before changing anything, the problem has to be understood, including how it shows up with other users' compilers. It does not help if you delete the bug report.
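The symbol-name mismatch described above (callers expecting `func_`, the library exporting `FUNC`) is commonly bridged with a preprocessor macro that selects the spelling in one place. A minimal sketch follows, assuming a made-up configuration macro `FORTRAN_UPPERCASE_NAMES`, a made-up wrapper macro `F77_NAME`, and a stand-in routine `DSCALE`/`dscale_` defined in the same file so that the example links on its own. None of these are IT++ or ACML names.

```c
#include <assert.h>

/* Hypothetical switch: a build system would define this when the Fortran
 * library exports upper-case symbols without a trailing underscore. */
#define FORTRAN_UPPERCASE_NAMES 1

#if FORTRAN_UPPERCASE_NAMES
#  define F77_NAME(lower, UPPER) UPPER
#else
#  define F77_NAME(lower, UPPER) lower##_
#endif

/* Stand-in for a library routine (made up for this sketch).
 * Fortran-style interface: all arguments are passed by reference. */
void F77_NAME(dscale, DSCALE)(const int *n, const double *alpha, double *x)
{
    int i;
    for (i = 0; i < *n; i++)
        x[i] *= *alpha;
}

/* Caller code uses the macro and never spells either symbol directly. */
void scale_vector(int n, double alpha, double *x)
{
    assert(n >= 0);
    F77_NAME(dscale, DSCALE)(&n, &alpha, x);
}
```

With this layout, only the value of the configuration macro changes between libraries; the calling code stays the same.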

       
      • Adam Piątyszek

        Adam Piątyszek - 2008-03-04

        Frank,

        To me it is not a bug but a lack of a particular feature. It is clearly stated (in red) on the bug submission page that the bug tracker is only for confirmed bugs in IT++. The Help forum is for discussing problems, missing functionality, etc. Therefore, I had to remove your report from the bug tracker. Sorry!

        If you really have something to contribute in this area, you can open a new Feature Request ticket and there attach some patches, etc.

        But please do not expect that someone else will immediately start working on this issue, just because you would like to see it in IT++.

        BR,
        /Adam

         
    • Frank NN

      Frank NN - 2008-03-04

      The parallel matrix multiplication was from the examples directory in the ACML. Just call
      make OMP_NUM_THREADS=2

      Parallel FFT does not seem to work automatically. But I wrote an FFT loop in C, with a #pragma in the source. This also ran the loop partitions in parallel.

      You need the GCC switch -fopenmp and a GCC which does support it, see above.

       
    • Frank NN

      Frank NN - 2008-03-04

      I don't understand the autoconf scripting. Is nobody interested in making IT++ ACML4-compatible on the Windows platform? Before taking any action, the problem has to be discussed in depth, to understand the linker problems.

      If you're interested, this is how I made the FFT loop working in parallel:

      #pragma omp parallel
      for(j=0; j<ntimes; j++)
          cfft1d(0,n,x,comm,&info);
      #pragma omp barrier


      It's running with the ACML4 (IFORT32 DLL), compiled with Cygwin/GCC 4.2.3, both on Athlon64 and Intel CoreDuo (I don't have the MKL). Note that the cfft1d function interface is calling the C variant. The Fortran variant would be CFFT1D. IT++ is calling Fortran interfaces, except for the FFT functions. Run time went down from 50sec single-core to 30sec dual-core (on Pentium). This was without IT++. I think the IT++ code does not have to change for a similar loop in IT++. The user can put the #pragma on an IT++ loop.

      A single big FFT 1D does not run in parallel without the #pragma, while the DGEMM does.
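A variant of the loop above that divides the iterations explicitly among threads can be sketched with `#pragma omp parallel for`. This is an illustration under stated assumptions, not the code that was timed: `transform_in_place` is a made-up stand-in for a library call such as cfft1d, and each iteration is given its own block of the buffer so that threads never write to the same memory. Whether a given library routine may be called concurrently, and whether its work array can be shared between threads, has to be checked in that library's documentation.

```c
#include <assert.h>

/* Made-up stand-in for an in-place transform such as an FFT call.
 * Here it simply negates every element of the block. */
static void transform_in_place(double *block, int n)
{
    int k;
    for (k = 0; k < n; k++)
        block[k] = -block[k];
}

/* Applies the transform to ntimes consecutive blocks of length n. */
void transform_all(double *data, int ntimes, int n)
{
    int j;
    assert(ntimes >= 0 && n >= 0);

    /* The iterations are split among the threads and j is private to
     * each thread. Every iteration touches only its own block, so the
     * threads do not race on shared data. The construct ends with an
     * implicit barrier, so no explicit barrier is needed afterwards. */
    #pragma omp parallel for
    for (j = 0; j < ntimes; j++)
        transform_in_place(data + (long)j * n, n);
}
```

As with any OpenMP code, this needs the -fopenmp switch to run in parallel; without it the pragma is ignored and the loop runs serially with the same result.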

       
