Hi,
The following benchmarks come from runs in ESMF batch queue rg8.
These were run by changing NTHREADS in nco_bm.sh and then submitting it with

    llsubmit nco_bm.sh
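(For concreteness, a minimal sketch of what such a job script might contain is below. The LoadLeveler directives and the NTHREADS hand-off are illustrative assumptions, not nco_bm.sh's actual contents; only the class name rg8 and the llsubmit command come from this thread.)

    #!/bin/sh
    # Hypothetical LoadLeveler script; the real nco_bm.sh directives may differ
    # @ class  = rg8
    # @ output = nco_bm.$(jobid).out
    # @ error  = nco_bm.$(jobid).err
    # @ queue
    NTHREADS=8                        # edited by hand for the 2-, 4-, and 8-thread runs
    export OMP_NUM_THREADS=$NTHREADS  # hand the thread count to the OpenMP runtime
    ./nco_bm.pl                       # benchmark driver; exact invocation omitted here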
The idea of running in the batch queue is to guarantee processor
availability and to minimize benchmark variability.
Yet, as you will see, variability can still be quite high (ncpdq, for
example). Why is there any significant variability at all in a batch queue?
The benchmarks are for 2, 4, and 8 threads, each repeated twice.
Some operators will require more (four? eight?) repetitions to get a
statistically meaningful result, since the standard error of the mean
shrinks only as 1/sqrt(n).
I'd like to understand why variability is so high, so that we can
reduce it if possible and thereby get away with fewer repetitions.
ncwa seems to scale well, as expected, though results are less clear
for the remaining threading operators.
Probably next week I will discuss with Harry revising the benchmarks
to better highlight certain operations, and to provide more concise
diagnostics from the information nco_bm.pl already measures.
If you have suggestions for benchmark revisions, please post them
soon, as we do not want to change the benchmarks very often.
Thanks,
Charlie
Test      Success  Failure  Total      Time (OMP threads = 8, run 1 of 2)
ncap:           6        4     10  279.5215
ncatted:        1        0      1    0.5853
ncbo:           8        0      8  271.6180
ncflint:        3        0      3    0.8077
ncea:           6        0      6   99.4940
ncecat:         1        0      1    0.2696
ncks:          15        0     15    1.8946
ncpdq:          6        0      6  454.3183
ncra:          15        2     17  109.3245
ncwa:          37        0     37   79.4942
Test      Success  Failure  Total      Time (OMP threads = 8, run 2 of 2)
ncap:           6        4     10  265.7919
ncatted:        1        0      1    0.5488
ncbo:           8        0      8  260.2727
ncflint:        3        0      3    0.7674
ncea:           6        0      6   86.4602
ncecat:         1        0      1    0.2761
ncks:          15        0     15    1.8597
ncpdq:          6        0      6  358.6496
ncra:          15        2     17  109.3547
ncwa:          37        0     37   77.3682
loadleveler:
Test      Success  Failure  Total      Time (OMP threads = 4, run 1 of 2)
ncap:           6        4     10  354.6031
ncatted:        1        0      1    0.5587
ncbo:           8        0      8  273.5378
ncflint:        3        0      3    0.7771
ncea:           6        0      6  109.0394
ncecat:         1        0      1    0.2613
ncks:          15        0     15    1.9057
ncpdq:          6        0      6  660.7932
ncra:          15        2     17  135.5798
ncwa:          37        0     37  130.6776
loadleveler:
Test      Success  Failure  Total      Time (OMP threads = 4, run 2 of 2)
ncap:           6        4     10  312.1353
ncatted:        1        0      1    0.6110
ncbo:           8        0      8  262.2002
ncflint:        3        0      3    0.6750
ncea:           6        0      6   94.6699
ncecat:         1        0      1    0.3568
ncks:          15        0     15    1.9946
ncpdq:          6        0      6  461.3167
ncra:          15        2     17  136.2394
ncwa:          37        0     37  121.1632
Test      Success  Failure  Total      Time (OMP threads = 2, run 1 of 2)
ncap:           6        4     10  381.3692
ncatted:        1        0      1    0.5680
ncbo:           8        0      8  234.7483
ncflint:        3        0      3    0.6513
ncea:           6        0      6  117.4369
ncecat:         1        0      1    0.2669
ncks:          15        0     15    1.7837
ncpdq:          6        0      6  509.1903
ncra:          15        2     17  153.3728
ncwa:          37        0     37  156.7412
Test      Success  Failure  Total      Time (OMP threads = 2, run 2 of 2)
ncap:           6        4     10  359.3890
ncatted:        1        0      1    0.5667
ncbo:           8        0      8  224.5704
ncflint:        3        0      3    0.6571
ncea:           6        0      6  103.7265
ncecat:         1        0      1    0.2788
ncks:          15        0     15    1.9143
ncpdq:          6        0      6  480.8395
ncra:          15        2     17  151.8112
ncwa:          37        0     37  156.5773
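To put numbers on the scaling claim above: averaging the two runs at each
thread count, ncwa drops from (156.7412 + 156.5773)/2 ≈ 156.7 s at 2 threads
to (79.4942 + 77.3682)/2 ≈ 78.4 s at 8 threads, a speedup of ≈ 2.0x for a 4x
increase in threads. By contrast, the two 4-thread ncpdq runs differ from
each other by ~200 s (660.79 vs 461.32), larger than ncwa's entire
2-to-8-thread gain, which is the variability problem in a nutshell.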
The batch queue guarantees CPU availability, but does it also guarantee
I/O bandwidth to disk? I don't know the I/O architecture of the ESMF,
but it looks like the tests are waiting on disk, possibly contending for
disk I/O with processes on other nodes. Is that a possibility, or does
each node get its own minimum disk bandwidth? Do the nodes have
node-local disk storage through which all I/O is staged before being
written out to permanent storage?
If not, I think that would explain the variability.
hjm
Hi Harry,
Of course it's possible this is a result of disk contention for /ptmp.
I mean, what else could it be? I'm surprised this turns out to be
such a large factor, though, because ESS jobs typically do not
do a lot of I/O. Anyway, this theory can (and should) be tested
by altering the batch benchmarks to write to /tmp (which is local)
rather than /ptmp (which is shared), thus removing I/O contention
from other jobs.
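(A minimal sketch of that test follows; NCO_BM_DIR is a hypothetical knob,
and /ptmp/$USER is a guess at the ptmp layout. If nco_bm.sh hard-codes its
paths, the equivalent one-line edit goes inside the script instead.)

    # Node-local run: no cross-job disk contention possible
    mkdir -p /tmp/nco_bm
    NCO_BM_DIR=/tmp/nco_bm ./nco_bm.sh > timing.tmp.log 2>&1

    # Shared-filesystem run: the current setup, exposed to other jobs' I/O
    mkdir -p /ptmp/$USER/nco_bm
    NCO_BM_DIR=/ptmp/$USER/nco_bm ./nco_bm.sh > timing.ptmp.log 2>&1

If the /tmp timings are both faster and markedly more repeatable, contention
on /ptmp is the culprit.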
Also, it might be helpful to (re-)add either or both of the user and
sys times to the benchmark summaries, to make the distinction between
CPU time and time spent waiting on I/O clearer.
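(In the meantime, the same signal is available by wrapping a single operator
in the shell's time builtin; in.nc and out.nc below are placeholder file
names, and the -a lat,lon averaging dimensions are just an example.)

    # real >> user + sys  =>  the process spent most of its time waiting on disk
    # real ~= user + sys  =>  the process was CPU-bound
    time ncwa -O -a lat,lon in.nc out.nc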
Thanks,
Charlie