AMD Ryzen 9 7950X

Elk Users
Markus
2023-03-02
2023-03-15
  • Markus

    Markus - 2023-03-02

    Dear elkies,
    here I just want to quickly share some (unscientific) benchmarks of the Ryzen 9 7950X CPU with HPL and elk. TLDR: This CPU is a beast!

    I link against the latest AOCL 4.0, using
    F90_LIB = -lblis -lflame -lfftw3 -lfftw3f

    HPL gives me an optimized 910 GFLOP/s, although this is limited by the small 32 GB of RAM in my system. Note that SMT should be disabled, and that we run either 16 MPI processes or 16 threads.

    Now let's throw in elk 8.7.10. We do the basic/HfSiO4 example, but to give the CPU some work, we increase ngridk to 8 8 8 and set spinorb .true.
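For anyone who wants to reproduce the modified input, here is a minimal sketch of the two changes in Python. It assumes the usual elk.in layout of a block name on one line followed by a single value line; the helper `set_block` and the sample excerpt are hypothetical, and the real elk.in of the example contains more blocks.

```python
def set_block(text: str, name: str, value: str) -> str:
    """Replace the value line of an Elk input block, or append the block if absent.

    Assumes the simple layout: block name on one line, one value line below it.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == name:
            if i + 1 < len(lines):
                lines[i + 1] = f"  {value}"
            else:
                lines.append(f"  {value}")
            return "\n".join(lines) + "\n"
    # Block not present: append it, separated from the rest by a blank line.
    if lines and lines[-1].strip():
        lines.append("")
    lines += [name, f"  {value}"]
    return "\n".join(lines) + "\n"


# Hypothetical excerpt standing in for the example's elk.in.
sample = "tasks\n  0\n\nngridk\n  2 2 2\n"
patched = set_block(set_block(sample, "ngridk", "8 8 8"), "spinorb", ".true.")
print(patched)
```

Editing the two blocks by hand in a text editor is of course just as good; the sketch only makes the intended change explicit.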

    Runtime is 2m24s and 2077 CPU seconds. This is really good performance. Previously I was an Intel user because of the lack of efficient linear algebra libraries for AMD platforms a few years ago. That has changed now.

    For comparison, I ran the exact same test on an old dual socket Intel Xeon E5-2697A v4 system with 32 cores in total, linked to MKL. Runtime is 3m37s and CPU time is 5789 s.

    Some quick math:
    The 7950X has 16 cores, which boost to 5.1 GHz during the tests (16 x 5.1 GHz = 81.6E9 CPU cycles/s). The Xeon system has 32 cores boosting to 2.9 GHz (32 x 2.9 GHz = 92.8E9 CPU cycles/s). The Xeon system has AVX2, while the Ryzen has AVX512, and this could make all the difference: the Ryzen is roughly 2.5x more efficient (a theoretical 2x gain due to AVX512, plus some improvements in instructions per cycle).
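The post does not spell out which normalization leads to the 2.5x figure, so the following Python sketch simply computes a few plausible ratios from the numbers quoted above. The variable names are made up for illustration, and which ratio counts as "efficiency" is a judgment call.

```python
# Figures quoted in the post (wall time converted to seconds).
ryzen = {"cores": 16, "ghz": 5.1, "wall_s": 2 * 60 + 24, "cpu_s": 2077}
xeon = {"cores": 32, "ghz": 2.9, "wall_s": 3 * 60 + 37, "cpu_s": 5789}


def aggregate_ghz(machine):
    """Total clock rate over all cores, in 1e9 cycles per second."""
    return machine["cores"] * machine["ghz"]


# Plain wall-clock speedup of the Ryzen over the dual Xeon.
wall_speedup = xeon["wall_s"] / ryzen["wall_s"]

# Ratio of accumulated CPU seconds for the same job.
cpu_time_ratio = xeon["cpu_s"] / ryzen["cpu_s"]

# Work per core: wall time weighted by the number of cores.
per_core_ratio = (xeon["wall_s"] * xeon["cores"]) / (ryzen["wall_s"] * ryzen["cores"])

# Work per clock cycle: wall time weighted by the aggregate clock rate.
per_cycle_ratio = (xeon["wall_s"] * aggregate_ghz(xeon)) / (
    ryzen["wall_s"] * aggregate_ghz(ryzen)
)

for label, value in [
    ("wall-clock speedup", wall_speedup),
    ("CPU-time ratio", cpu_time_ratio),
    ("per-core ratio", per_core_ratio),
    ("per-cycle ratio", per_cycle_ratio),
]:
    print(f"{label:20s} {value:.2f}")
```

With the quoted numbers this gives about 1.51 (wall clock), 2.79 (CPU time), 3.01 (per core) and 1.71 (per cycle), so the estimate depends noticeably on how it is normalized.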

    So if someone is looking for a relatively cheap compute workstation, the 7950X is a clear recommendation. The performance available for less than USD 2000 for a full system is ridiculous.

    N.B.: Compiling with AOCC 4.0 makes no difference compared to gcc 11.3 for elk, so it is not worth it in my tests. Switching to -march=znver3 (thereby disabling AVX512) also makes virtually no difference. The gains seem to come mostly from AOCL (the linear algebra and FFTW libraries), which still has AVX512 enabled.

    Cheers,
    Markus

     
  • Zhiwei Li

    Zhiwei Li - 2023-03-03

    Dear Markus,

    Very interesting post.
    It would be great if you could post your make.inc file or, if you have time, write a command-by-command explanation of how to compile elk on a pristine system, say Ubuntu, including preparing all the needed libraries, which can be very overwhelming for a beginner.

    Best,
    Zhiwei

     
    • Markus

      Markus - 2023-03-08

      You can find one now in the Elk Users forum.

       
  • Vitaliy Romaka

    Vitaliy Romaka - 2023-03-05

    Markus, could you please post your make file?

     
  • Markus

    Markus - 2023-03-06

    Sure, here's a minimal make.inc for gfortran. I've stripped off all the comments etc. This is a make.inc for the single-threaded version of blis, which is fine for a workstation. The exceptions are maybe diagonalizing the BSE Hamiltonian or running molecules in a large vacuum cell, where you don't have k-point parallelism; there you would want to link the multi-threaded library and create a separate executable for only these tasks.

    MAKE = make
    AR = ar
    
    SRC_MKL = mkl_stub.f90
    SRC_OBLAS = oblas_stub.f90
    SRC_BLIS = blis_stub.f90
    SRC_MPI = mpi_stub.f90
    SRC_FFT = zfftifc_fftw.f90 cfftifc_fftw.f90
    SRC_LIBXC = libxcifc_stub.f90
    SRC_W90S = w90_stub.f90
    
    F90 = gfortran
    F90_OPTS = -Ofast -march=native -fomit-frame-pointer -fopenmp -ffpe-summary=none -fallow-argument-mismatch
    F90_LIB = -lblis -lflame -lfftw3 -lfftw3f
    

    Remember to install AOCL 4.0 according to the installation manual, e.g. in /home/youruser/amd (that's just a simple install in the userspace) and run

    source ~/amd/aocl/4.0/amd-libs.cfg
    

    or whatever the path to your amd-libs.cfg is. The installer tells you the path after installation. You have to run this in every terminal that you use for compilation (the linker won't find the libs otherwise) and for actually running the code. Add this line to your .bashrc if you want this to happen automatically.

    This is compiled without MPI, because the OpenMP parallelism works extremely well in elk. However, if you want MPI you can just install the openmpi dev packages, remove the SRC_MPI line and replace gfortran with mpif90.

     
  • Markus

    Markus - 2023-03-06

    By the way, I'm testing the machine with Ubuntu 22.04.2 LTS' openblas / lapack / fftw3 packages. Indeed, this is pretty performant and runs the test described above in 2m17s when setting OPENBLAS_NUM_THREADS=1. But please note that other tasks might work better with AOCL; this certainly depends on which routines are used and whether AOCL or OpenBLAS+FFTW3 is faster for them.

    But embarrassingly for AMD, HPL spits out an optimized 944 GFLOP/s with Ubuntu's openblas. So it seems that running elk with the open source libraries that come with, e.g., Ubuntu is just fine.
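To compare library stacks in a repeatable way, a small timing wrapper along these lines may help. It is only a sketch: the helper `run_timed` is hypothetical, the CPU time is taken from the child-process counters of `os.times()` (meaningful on Linux, not on Windows), and the Python interpreter is used as a stand-in command so that the block runs without elk installed.

```python
import os
import subprocess
import sys
import time


def run_timed(cmd, extra_env=None, cwd=None):
    """Run cmd and return (wall seconds, child CPU seconds, return code)."""
    # Start from the current environment and overlay the extra variables.
    env = dict(os.environ, **(extra_env or {}))
    t0 = time.perf_counter()
    c0 = os.times()
    proc = subprocess.run(cmd, env=env, cwd=cwd, check=False)
    c1 = os.times()
    wall = time.perf_counter() - t0
    # User + system time accumulated by finished child processes.
    cpu = (c1.children_user - c0.children_user) + (
        c1.children_system - c0.children_system
    )
    return wall, cpu, proc.returncode


# Stand-in command; replace it with the path to your own elk executable
# and set cwd to the directory that contains elk.in.
wall, cpu, rc = run_timed(
    [sys.executable, "-c", "pass"],
    extra_env={"OPENBLAS_NUM_THREADS": "1"},
)
print(f"wall {wall:.2f} s, cpu {cpu:.2f} s, exit code {rc}")
```

Running the same wrapper once per build (AOCL, OpenBLAS, MKL) on the same input yields wall and CPU times in the form quoted in this thread.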

     
  • Markus

    Markus - 2023-03-15

    I have now tried compiling elk with the latest Intel OneAPI compiler and linking against the latest MKL. Our test above runs in 2m23s (with a single MKL thread), so very close to the result with gcc/AOCL, but still a bit slower than gcc + OpenBLAS.

    Warning: Do not use -xHost! This produces a super-slow executable. Instead, use the following line for ifort:

    F90_OPTS = -O3 -march=skylake-avx512 -ipo -qopenmp -qmkl=sequential

    The problems with a slow codepath for MKL on AMD that were reported in the past seem to be resolved now. MKL performs reasonably well on my 7950X.

    Best regards,
    Markus

     