I'm interested in using this library for my C++ simulation codes. I think the library has great potential to improve performance of OpenMP-parallel C++ codes on ccNUMA machines (the SGI Altix is the main target).
We find that data placement is critical to the performance and scaling of our simulations. To address this, we have developed a simple allocator that manages a chunk of memory on each physical processor. The main objects are shared, but the allocator keeps each one resident on a particular physical processor. We can structure our code so that objects are read-only for remote processors, while the processor that owns an object can both read and write it. However, our classes are not generic enough to use the allocator with arbitrary objects, nor are they STL-compatible.
My question is: would it be possible to use your library (with some modifications if necessary) to control data placement on each physical processor and implement the ideas described above?
Thanks in advance.
Perhaps I do not understand your question.
Effectively, this allocator works by calling shm_open(), a POSIX system call, to open a UNIX shared memory segment. Then mmap() is used to map that shared memory segment to an address specified by the allocator's parameter keys. The rest of the allocator deterministically divides up the memory and hands element-sized portions to the STL container. Shared access is provided by allowing secondary processes to use an identical key to open the existing shared memory segment and attach to the containers placed in it.
What you could perhaps do, assuming you have enough control, is give the per-processor processes separate keys. Say you have 4 processors: 1, 2, 3, and 4. Processor 1 could create allocators using keys 1, 5, 9, 13, ...; Processor 2 could create allocators using keys 2, 6, 10, 14, ...; and so forth. Processor 2 could then access the elements Processor 1 created and stored under key 1, and Processor 1 could access the elements stored in the allocator Processor 2 created under key 6. The remaining issue is getting the underlying system calls to create the shared memory segments in memory controlled by specific processors. This assumes you are describing a single multiprocessor machine. It also assumes that the operating system does not migrate the processes among the processors, in which case the situation is obviously more dynamic. If system memory is allocated to processors based on physical address, then your goal seems reasonable.
Also, if you are thinking about memory control in terms of, say, how the elements of a matrix are aligned in memory, you do not have that much control with C++ STL allocators. The allocators simply request blocks of adjacent memory cells for their containers. The container elements therefore end up in adjacent cells, but that is not the same as choosing a row-major versus column-major array layout, as in Fortran; there is no control at that level.
Please accept my apologies if I have misunderstood
your question. Thank you for your interest.