James Dinan Stephen Olivier

The Unbalanced Tree Search Benchmark

The Unbalanced Tree Search (UTS) benchmark is a parallel benchmarking code that reports the performance achieved when performing an exhaustive search on an unbalanced tree. The tree is generated on the fly using a splittable random number generator (RNG) that allows the random stream to be split and processed in parallel while still producing a deterministic tree. The splittable RNG has been constructed using the SHA1 secure hash algorithm. Thus, generating a node's children requires multiple applications of the SHA1 hash algorithm to generate splittable hashes for each child.

Project Members

  • The Ohio State University
    • Professor P. Sadayappan
    • James Dinan
    • Gerald Sabin
  • University of Maryland
    • Professor Chau-Wen Tseng
  • University of North Carolina
    • Professor Jan Prins
    • Stephen Olivier

Load Balancing

The UTS benchmark presents significant load imbalance problems to any parallelization scheme. The size and shape of the tree are random and inferring these parameters from any node in the tree would require a shortcut to solving the SHA-1 secure hash algorithm. Thus, any static work distribution scheme is highly unlikely to yield good performance for UTS.

We have investigated a number of load balancing schemes in the context of UTS. Most schemes fall into two categories: work stealing and work sharing.

Work Stealing

Work stealing is a distributed load balancing scheme where idle processors poll their peers in the computation for work. Depending on the programming model stealing may be done in a one-sided manner (e.g. in UPC, OpenMP, Shmem, etc) or in a two sided manner (e.g. in MPI). When a ''thief'' finds a ''victim'' with work, it steals a portion of the victim's work and is able to continue processing.

Work Sharing

Work sharing uses a centralized mechanism for distributing work. Work sharing can be accomplished through the use of work servers or shared work queues. In these schemes, workers send new tasks (in the case of UTS these are tree nodes) to the server or enqueue them in the queue. When a worker exhausts its work it can get more from the server or the shared queue.

Implementations of UTS

Our distribution of UTS includes implementations for:

  • MPI
  • UPC
  • OpenMP
  • Shmem
  • GPShmem
  • Pthreads
  • Cray MTA/XMT

Implementations in additional parallel models include:


Scalable Dynamic Load Balancing Using UPC PDF

Stephen Olivier, Jan Prins.

Proc. of 37th International Conference on Parallel Processing (ICPP-08). Portland, OR, September 2008.

A Message Passing Benchmark for Unbalanced Applications DOI

James Dinan, Stephen Olivier, Gerald Sabin, Jan Prins, P. Sadayappan, Chau-Wen Tseng.

J. Simulation Modelling Practice and Theory (SIMPAT). Volume 16, Issue 9, Pages 1177-1189. October, 2008.

Dynamic Load Balancing of Unbalanced Computations Using Message Passing PDF

James Dinan, Stephen Olivier, Jan Prins, Gerald Sabin, P Sadayappan and Chau-Wen Tseng.

Proc. of 6th Intl. Workshop on Performance Modeling, Evaluation, and Optimization of
Parallel and Distributed Systems (PMEO-PDS 2007). Long Beach, CA, March 26-30, 2007.

UTS: An Unbalanced Tree Search Benchmark PDF

Stephen Olivier, Jun Huan, Jinze Liu, Jan Prins, James Dinan, P Sadayappan and Chau-Wen Tseng.

Proc. 19th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC). 2006.

UPC Implementation of an Unbalanced Tree Search Benchmark PDF

Jan Prins, Jun Huan, Bill Pugh, Chau-Wen Tseng, P. Sadayappan.

UNC Dept. of Computer Science Technical Report 03-034, Oct 2003.

Articles That Use UTS

Integrating Asynchronous Task Parallelism with MPI DOI

S. Chatterjee, S. Tasırlar, Z. Budimlic, V. Cave, M. Chabbi, M. Grossman, V. Sarkar, and Yonghong Yan

Proc. 27th International Parallel and Distributed Processing Symposium (IPDPS). May, 2013.

Work Stealing for Multi-core HPC Clusters DOI

Kaushik Ravichandran, Sangho Lee and Santosh Pande

Europar 2011, Lecture Notes in Computer Science, Volume 6852, Pages 205-217, 2011.

An Adaptive Framework for Large-scale State Space Search Link

Yanhua Sun, Gengbin Zheng, Pritish Jetley, and Laxmikant Kale.

Workshop on Large Scale Parallel Processing at IPDPS (LSPP) 2011

Unbalanced Tree Search on a Manycore System Using the GPI Programming Model DOI

Rui Machado, Carsten Lojewski, Salvador Abreu, and Franz-Josef Pfreundt

Computer Science Research and Development, Special Issue on ISC '11. Volume 26, Numbers 3-4, Pages 229-236.