libbsr - allocation services for the Barrier Synchronization Register
Sonny Rao - 9/2008
What is the BSR?
The BSR is a hardware register that processes memory-map when they
need very fast barrier synchronization. It provides a set of counters
that different processors can update, with each update propagating
quickly to all the other processors. Doing the same with normal memory
makes the cacheline holding the counter "bounce" between processors:
every processor that writes to the counter must first obtain ownership
of the cacheline and then cede it to the next processor performing an
update. Each counter is only 1 byte, and the number of counters is
limited. The number of counters is what is referred to as the size of
the register. Typically the BSR can also be partitioned into separate
pieces, which allows unrelated applications to use the resource in
isolation.
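
For comparison, here is a minimal sketch of the normal-memory approach
described above: a barrier built from one byte per participant, all
packed into a single cacheline, so every update forces that line to
move between processors. This is not libbsr code and does not touch the
BSR. All names (byte_barrier, MAX_PARTICIPANTS, run_byte_barrier_demo)
are made up for illustration, and C11 atomics plus POSIX threads are
assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_PARTICIPANTS 8   /* hypothetical stand-in for the register size */

/* One byte per participant, deliberately packed into a single cacheline. */
struct byte_barrier {
    _Atomic unsigned char slot[MAX_PARTICIPANTS];
};

/* Publish `phase` in our own byte, then wait for every byte to reach it.
 * A peer can be at most one phase ahead of us, so a wrapped difference of
 * 0 or 1 means "arrived"; 255 means "still in the previous phase". */
static void byte_barrier_wait(struct byte_barrier *b, int id, int n,
                              unsigned char phase)
{
    atomic_store_explicit(&b->slot[id], phase, memory_order_release);
    for (int i = 0; i < n; i++) {
        for (;;) {
            unsigned char seen =
                atomic_load_explicit(&b->slot[i], memory_order_acquire);
            if ((unsigned char)(seen - phase) <= 1)
                break;
            sched_yield();   /* stay polite if threads outnumber CPUs */
        }
    }
}

struct demo_ctx {
    struct byte_barrier barrier;
    atomic_int progress[MAX_PARTICIPANTS]; /* last phase each thread began */
    atomic_int violations;
    int nthreads;
    int nphases;
};

struct demo_arg {
    struct demo_ctx *ctx;
    int id;
};

static void *demo_worker(void *p)
{
    struct demo_arg *arg = p;
    struct demo_ctx *ctx = arg->ctx;

    for (int phase = 1; phase <= ctx->nphases; phase++) {
        atomic_store(&ctx->progress[arg->id], phase);   /* the "work" */
        byte_barrier_wait(&ctx->barrier, arg->id, ctx->nthreads,
                          (unsigned char)phase);
        /* After the barrier nobody may still be in an earlier phase. */
        for (int i = 0; i < ctx->nthreads; i++)
            if (atomic_load(&ctx->progress[i]) < phase)
                atomic_fetch_add(&ctx->violations, 1);
    }
    return NULL;
}

/* Returns how many times a thread left the barrier while a peer was still
 * in an earlier phase (0 if the barrier held every time). */
static int run_byte_barrier_demo(int nthreads, int nphases)
{
    struct demo_ctx ctx;
    struct demo_arg args[MAX_PARTICIPANTS];
    pthread_t tid[MAX_PARTICIPANTS];

    assert(nthreads >= 1 && nthreads <= MAX_PARTICIPANTS);
    for (int i = 0; i < MAX_PARTICIPANTS; i++) {
        atomic_init(&ctx.barrier.slot[i], 0);
        atomic_init(&ctx.progress[i], 0);
    }
    atomic_init(&ctx.violations, 0);
    ctx.nthreads = nthreads;
    ctx.nphases = nphases;

    for (int i = 0; i < nthreads; i++) {
        args[i].ctx = &ctx;
        args[i].id = i;
        if (pthread_create(&tid[i], NULL, demo_worker, &args[i]) != 0)
            abort();         /* peers would otherwise wait forever */
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&ctx.violations);
}
```

Calling run_byte_barrier_demo(4, 300) runs four threads through 300
phases, enough for the byte-sized phase counter to wrap around. The
result is correct, but every store to a slot invalidates the shared
cacheline in the other processors' caches, which is the cost the BSR is
meant to avoid.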
How is it used? What is a typical application?
In HPC (High Performance Computing) applications, a large data set is
typically carved up into many pieces, and the processors in the system
operate on those pieces concurrently. Such applications are usually
divided into multiple phases, and the processors must synchronize with
each other between phases. This is done with a barrier of some kind.
The BSR aims to provide the fastest possible barrier mechanism.
This is only one way of using the hardware; many others are possible.
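
The phased pattern can be sketched with an ordinary POSIX barrier.
Nothing here uses the BSR or libbsr; pthread_barrier_wait() merely
stands in for "a barrier of some kind", and the worker count, piece
size, and function names are arbitrary choices for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define NWORKERS 4      /* illustrative number of processors */
#define PIECE    1000   /* illustrative elements per piece */

static long data[NWORKERS * PIECE];
static long partial[NWORKERS];
static pthread_barrier_t phase_barrier;

static void *phase_worker(void *p)
{
    int id = (int)(intptr_t)p;
    long *mine = &data[id * PIECE];
    const long *next = &data[((id + 1) % NWORKERS) * PIECE];
    long sum = 0;

    /* Phase 1: each worker fills only its own piece. */
    for (int i = 0; i < PIECE; i++)
        mine[i] = (long)id * PIECE + i;

    /* Nobody may start phase 2 until every piece has been written. */
    pthread_barrier_wait(&phase_barrier);

    /* Phase 2: each worker reads the piece written by its neighbour. */
    for (int i = 0; i < PIECE; i++)
        sum += next[i];
    partial[id] = sum;
    return NULL;
}

/* Runs both phases and returns the sum over the whole data set. */
static long run_two_phase_sum(void)
{
    pthread_t tid[NWORKERS];
    long total = 0;
    int rc = pthread_barrier_init(&phase_barrier, NULL, NWORKERS);

    assert(rc == 0);
    (void)rc;
    for (int i = 0; i < NWORKERS; i++)
        if (pthread_create(&tid[i], NULL, phase_worker,
                           (void *)(intptr_t)i) != 0)
            abort();    /* the barrier would otherwise never release */
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
    pthread_barrier_destroy(&phase_barrier);

    for (int i = 0; i < NWORKERS; i++)
        total += partial[i];
    return total;
}
```

Without the barrier, a worker could read its neighbour's piece before
that piece has been filled in. With it, run_two_phase_sum() returns the
sum of 0 through 3999, which is 7998000.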
What does this library do?
The library's purpose is to help applications use the BSR device as
exported by the Linux kernel. The kernel simply exports every form of
the BSR as a character device file and does not try to enforce any
policy on its use. Because the BSR comes in different sizes, and
especially because different pieces can alias on top of a larger
piece, applications need a way to query what is available and to
reserve pieces for their own use.
libbsr currently provides four functions:
  bsr_query - get information on what sizes are available and the
              total size of the BSR
  bsr_alloc - reserve a BSR device
  bsr_free  - release a previously reserved BSR device
  bsr_map   - map a BSR device, using mmap()
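
Because the kernel exports the BSR as character device files and
bsr_map is described as using mmap(), the underlying step can be
sketched with plain POSIX calls. This is not libbsr's implementation
and does not show its function signatures, which this document does not
give. The helper names map_device_bytes and unmap_device_bytes are
hypothetical, and the caller must supply the device path and length; no
particular BSR device name is assumed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `len` bytes of the character device at `path`
 * for shared read/write access. Returns NULL on any failure. */
static volatile unsigned char *map_device_bytes(const char *path, size_t len)
{
    int fd;
    void *p;

    assert(path != NULL && len > 0);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED so that stores reach the device, not a private copy. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);          /* the mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: undo map_device_bytes(). Returns 0 on success. */
static int unmap_device_bytes(volatile unsigned char *p, size_t len)
{
    return munmap((void *)p, len);
}
```

The pointer is declared volatile so that the compiler performs every
byte access instead of caching values in registers, which matters when
other processors update the same bytes. The helpers can be tried on any
mappable character device, for example /dev/zero on Linux.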