The Unbalanced Tree Search (UTS) benchmark is a parallel benchmarking code that reports the performance achieved when performing an exhaustive search on an unbalanced tree. The tree is generated on the fly using a splittable random number generator (RNG) that allows the random stream to be split and processed in parallel while still producing a deterministic tree. The splittable RNG has been constructed using the SHA1 secure hash algorithm. Thus, generating a node's children requires multiple applications of the SHA1 hash algorithm to generate splittable hashes for each child.
The UTS benchmark presents significant load imbalance problems to any parallelization scheme. The size and shape of the tree are random and inferring these parameters from any node in the tree would require a shortcut to solving the SHA-1 secure hash algorithm. Thus, any static work distribution scheme is highly unlikely to yield good performance for UTS.
We have investigated a number of load balancing schemes in the context of UTS. Most schemes fall into two categories: work stealing and work sharing.
Work stealing is a distributed load balancing scheme where idle processors poll their peers in the computation for work. Depending on the programming model stealing may be done in a one-sided manner (e.g. in UPC, OpenMP, Shmem, etc) or in a two sided manner (e.g. in MPI). When a ''thief'' finds a ''victim'' with work, it steals a portion of the victim's work and is able to continue processing.
Work sharing uses a centralized mechanism for distributing work. Work sharing can be accomplished through the use of work servers or shared work queues. In these schemes, workers send new tasks (in the case of UTS these are tree nodes) to the server or enqueue them in the queue. When a worker exhausts its work it can get more from the server or the shared queue.
Our distribution of UTS includes implementations for:
Implementations in additional parallel models include:
Scalable Dynamic Load Balancing Using UPC PDF
Stephen Olivier, Jan Prins.
Proc. of 37th International Conference on Parallel Processing (ICPP-08). Portland, OR, September 2008.
A Message Passing Benchmark for Unbalanced Applications DOI
James Dinan, Stephen Olivier, Gerald Sabin, Jan Prins, P. Sadayappan, Chau-Wen Tseng.
J. Simulation Modelling Practice and Theory (SIMPAT). Volume 16, Issue 9, Pages 1177-1189. October, 2008.
Dynamic Load Balancing of Unbalanced Computations Using Message Passing PDF
James Dinan, Stephen Olivier, Jan Prins, Gerald Sabin, P Sadayappan and Chau-Wen Tseng.
Proc. of 6th Intl. Workshop on Performance Modeling, Evaluation, and Optimization of
Parallel and Distributed Systems (PMEO-PDS 2007). Long Beach, CA, March 26-30, 2007.
UTS: An Unbalanced Tree Search Benchmark PDF
Stephen Olivier, Jun Huan, Jinze Liu, Jan Prins, James Dinan, P Sadayappan and Chau-Wen Tseng.
Proc. 19th Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC). 2006.
UPC Implementation of an Unbalanced Tree Search Benchmark PDF
Jan Prins, Jun Huan, Bill Pugh, Chau-Wen Tseng, P. Sadayappan.
UNC Dept. of Computer Science Technical Report 03-034, Oct 2003.
Integrating Asynchronous Task Parallelism with MPI DOI
S. Chatterjee, S. Tasırlar, Z. Budimlic, V. Cave, M. Chabbi, M. Grossman, V. Sarkar, and Yonghong Yan
Proc. 27th International Parallel and Distributed Processing Symposium (IPDPS). May, 2013.
Work Stealing for Multi-core HPC Clusters DOI
Kaushik Ravichandran, Sangho Lee and Santosh Pande
Europar 2011, Lecture Notes in Computer Science, Volume 6852, Pages 205-217, 2011.
An Adaptive Framework for Large-scale State Space Search Link
Yanhua Sun, Gengbin Zheng, Pritish Jetley, and Laxmikant Kale.
Workshop on Large Scale Parallel Processing at IPDPS (LSPP) 2011
Unbalanced Tree Search on a Manycore System Using the GPI Programming Model DOI
Rui Machado, Carsten Lojewski, Salvador Abreu, and Franz-Josef Pfreundt
Computer Science Research and Development, Special Issue on ISC '11. Volume 26, Numbers 3-4, Pages 229-236.