My concern is about how you pick the memory address.
What if you need many, large shared containers and wish to share them amongst many applications with differing memory requirements? For example, if some applications use 2G or more of memory, what are the safe memory addresses that you're sure won't be used across all of your applications? Couldn't picking the wrong address cause a segmentation violation or, worse, overwrite the stack or heap?
Willing, but concerned,
Jason Haynberg
Trading Systems Developer
The key_addr is used as the first argument in the mmap call in the
shared_memory.h file to map the shared memory to a preferred memory
address. Effectively, the key_addr[] location is being
requested. If the location is not available, the
request will be denied, and the mmap call will fail causing the
allocator to throw an mmap_exception. The mmap_exception is an
internally generated allocator exception type. See the man page on
mmap for more info.
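The flow described above can be sketched roughly as follows. The function name and exception type here are illustrative stand-ins, not the allocator's actual API (which throws its own mmap_exception from shared_memory.h):

```cpp
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdexcept>

// Illustrative sketch: request a preferred address from mmap and treat
// any other placement as failure. map_shared_at() is a hypothetical
// name; the real allocator throws its own mmap_exception instead.
void *map_shared_at(void *preferred, size_t len, int fd)
{
    void *p = mmap(preferred, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED || p != preferred) {
        if (p != MAP_FAILED)
            munmap(p, len);  // mapped, but not at the requested address
        throw std::runtime_error("mmap did not honor the requested address");
    }
    return p;
}
```

Without MAP_FIXED the first argument is only a hint, so the explicit `p != preferred` check is what turns a silently relocated mapping into a hard failure.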
The operating system provides the memory. A segmentation violation is
not possible as the OS would not provide such an incorrect address.
Nor will the OS provide an illegal address from the heap or stack.
The whole system is based on cooperative processes using shared memory
though, so if the processes do not use the semaphore interlocks to
coordinate access to the shared objects or if one process runs amok,
problems are likely to occur. The original intent was to allow object
level locking on sets of objects. Of course the whole project is a
work in progress. I currently do not have research funding for the
summer, so I am available if anyone would like to contract me to
explore specific issues through educational grants. Otherwise, feel
free to submit questions via email and I will answer them as best as I
can. The system is GPL and comes with no warranties.
Marc
Well, OK, forget about the seg violation part. How can you pick a realistic address range (or ranges) that you can guarantee is not already used across many different applications?
Also, do you know of major corporations using your software? I noticed it is marked beta - do you feel it's production safe?
Thanks!
Jason
It's open source, so either you or an independent agent hired by you
can determine for yourselves whether it is safe. Further, you can publish your
results. It comes with no warranties. I have published articles on
this allocator at two peer reviewed conferences.
If an address range is selected which is already in use, the allocator
throws an internally generated mmap exception. One concern would be
synchronizing multiple processes to recover and then have all
participating processes reselect the same new address range which is
unused address space across all of the processes' virtual memory
spaces.
I have no idea who is using the allocator, corporations or otherwise.
I do respond to queries for fixes and for features, which is sort of
how it develops. Also, having seen the internals for fielded US Air
Traffic Control software, I am no longer sure what production safe
means.
That being said, I would appreciate any feedback you provide and any
suggestions for improving the package. I am currently working on
better encapsulating the classes and accelerating the memory
allocation performance.
How are you interested in applying the allocator? Are there
attributes which would make you feel more comfortable about its design
and safety? What do other software writers do to give you a warm
fuzzy that their software is safe and effective for your use?
Marc
> If an address range is selected which is already in use ... have all participating processes reselect the same new address range which is unused address space across all of the processes' virtual memory spaces.
So then all the applications would need to agree on some algorithm for increasing the starting address, say by 1MB increments. And they would also need to agree on some algorithm to allow only one application to create the mmap region; for example, if multiple applications try to create the same mmaped region at the same time, one should succeed and the others should react to the error appropriately, that is, open it again without the create flag. Do you agree?
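The create-vs-open race described here is the classic O_CREAT | O_EXCL pattern. A minimal sketch with POSIX shm_open() (the segment name and function below are illustrative, not part of the allocator):

```cpp
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>

// Every process tries to be the creator; exactly one wins, and the
// others see EEXIST and open the existing segment without O_CREAT.
int open_or_create_segment(const char *name, size_t len, bool *creator)
{
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0666);
    if (fd != -1) {
        *creator = true;
        if (ftruncate(fd, len) == -1) {  // the creator sizes the segment
            close(fd);
            return -1;
        }
        return fd;
    }
    if (errno != EEXIST)
        return -1;                       // a real error, not the creation race
    *creator = false;
    return shm_open(name, O_RDWR, 0);    // lost the race: open the existing one
}
```

One remaining wrinkle: a loser can open the segment before the winner's ftruncate() completes, so real code usually also waits (for example on a semaphore) until the creator signals that initialization is done.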
> I am no longer sure what production safe means
Ouch :-) I see what you mean, I too have been surprised by so-called production level code in commercial server software and real-time trading systems.
> I would appreciate any feedback you provide
Thank you. A related question, what happens when the container needs more memory than is available in the mmaped region?
> How are you interested in applying the allocator?
We need an easy mechanism to share data amongst applications. We're looking at alternatives, like yours, though we may decide to write a simple interface that merely reads/writes data to an mmapped fixed-sized C-style array with a header containing the necessary locks.
> Are there attributes which would make you feel more comfortable about its design and safety?
Admittedly, picking the actual memory address of where an object should reside feels very unnatural, non-standard, and undefined. Though, I do see why you did it - basically, the use of STL iterators, which are ultimately pointers, mandates this.
> What do other software writers do to give you a warm fuzzy that their software is safe and effective for your use?
The more major corporations that use it, the better the perception is. Just like in commercial software, those that have the bigger marquee clients are usually considered better.
Jason
>> If an address range is selected which is already in use ... have
>> all participating
>> processes reselect the same new address range which is unused
>> address space across all of the processes' virtual memory spaces.
> So then all the applications would need to agree on some algorithm
> for increasing the starting address, say by 1MB increments. And
> they would also need to agree on some algorithm to allow only one
> application to create the mmap region; for example, if multiple
> applications try to create the same mmaped region at the same
> time, one should succeed and the others should react to the error
> appropriately, that is, open it again without the create flag. Do
> you agree?
I think it's more complicated than that. The allocator is designed to
be used by separate processes, each of which has its own virtual
address space. If each application is an individual process, each
will have its own address space, so all of the mmaps can succeed. So
there is only a potential conflict if within an individual process,
something has already been mapped into the desired memory region.
With shm_open(), on the other hand, one process will create the shared
memory segment and the rest will open the existing segment.
>> I am no longer sure what production safe means
> Ouch :-) I see what you mean, I too have been surprised by
> so-called production level code in commercial server software and
> real-time trading systems.
>> I would appreciate any feedback you provide
> Thank you. A related question, what happens when the container
> needs more memory than is available in the mmaped region?
The allocator creates new shared memory segments and uses the space
from these new segments.
>> How are you interested in applying the allocator?
> We need an easy mechanism to share data amongst applications.
> We're looking at alternatives, like yours, though we may decide to
> write a simple interface that merely reads/writes data to an
> mmapped fixed-sized C-style array with a header containing the
> necessary locks.
You might also use files and flock() which would give you persistent
storage. The allocator was developed for sharing data through the STL
containers between multiple processes at relatively high speed.
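A minimal sketch of the flock() approach suggested above (the file name and function are illustrative):

```cpp
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

// Serialize writers to a plain file with an exclusive flock(), giving
// storage that is both shared and persistent. write_record_locked()
// is an illustrative name, not an API from the allocator.
bool write_record_locked(const char *path, const char *data, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd == -1)
        return false;
    if (flock(fd, LOCK_EX) == -1) {  // block until the exclusive lock is held
        close(fd);
        return false;
    }
    bool ok = (write(fd, data, len) == (ssize_t)len);
    flock(fd, LOCK_UN);
    close(fd);
    return ok;
}
```

flock() locks the whole file; fcntl() record locks would allow finer granularity at the cost of more bookkeeping.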
Marc
"Use of MAP_FIXED may result in unspecified behaviour in further use of brk(2), sbrk(2), malloc(3C), and shmat(2). The use of MAP_FIXED is discouraged, as it may prevent an implementation from making the most effective use of resources."
- Solaris 8 mmap man page
"MAP_FIXED: Use of this option is discouraged."
- Linux Red Hat Enterprise 3 mmap man page
I noticed in shared_memory.h, you comment out the use of MAP_FIXED, in favor of MAP_SHARED - I assume you check later whether the address returned from mmap differs from the start address you specified? What made you not use MAP_FIXED?
Jason
> I noticed in shared_memory.h, you comment out the use of MAP_FIXED,
> in favor of MAP_SHARED - I assume you check later if the address
> returned from mmap is not the address specified as the start
> address? What made you not use MAP_FIXED?
I initially experimented with MAP_FIXED, but switched to MAP_SHARED
based on the man page recommendations, as I recall. The return value
check is directly below the mmap call.
> So there is only a potential conflict if within an individual process, something has already been mapped into the desired memory region.
This is exactly my point.
[ and the related ]
> [ When the container needs more memory than is available in the mmaped region, the ] allocator creates new shared memory segments and uses the space from these new segments.
Don't you agree that this problem can happen multiple times and can get complicated? The problem being trying to find an unused address space across multiple applications - whether it be during the first use (of, let's say, a vector) or during every re-allocation (that is, when the vector needs more memory than its capacity).
>> So there is only a potential conflict if within an individual
>> process, something has already been mapped into the desired memory
>> region.
>
> This is exactly my point.
>
> [ and the related ]
>
>> [ When the container needs more memory than is available in the
>> mmaped region, the ] allocator creates new shared memory segments
>> and uses the space from these new segments.
>
> Don't you agree that this problem can happen multiple times and can
> get complicated? The problem being trying to find an unused
> address space across multiple applications - whether it be during
> the first use (of, let's say, a vector) or during every
> re-allocation (that is, when the vector needs more memory than its
> capacity).
I haven't experienced this problem, and no one has reported it as a
problem yet. The easiest way to find out is to test it.
My use of the allocator is to generally initialize STL objects early
and keep some around for the duration of the process. I usually have
a set of objects which are modified over time.
Marc
The allocator relies on the mmap system call to map a file into memory at a specific address in the process space. The following is a program that will test which addresses can be used. It prints out the address ranges which succeed, that is, when the address returned from mmap is the same that was requested.
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdlib>
#include <iostream>
using namespace std;

int main(int argc, char **argv)
{
    if (argc < 2) {
        cerr << "usage: " << argv[0] << " <size of mmaped file in MB>" << endl;
        exit(1);
    }
    size_t MEG = 1024 * 1024,
           size = atoi(argv[1]) * MEG,
           low = 0,
           high = 0;
    bool first = true, match = false;
    const char *file = "test";
    int fd = open(file, O_RDWR | O_CREAT, 0666);
    if (fd == -1) {
        perror("open");
        exit(1);
    }
    if (ftruncate(fd, size) == -1) {
        perror("ftruncate");
        exit(1);
    }
    // in 1M increments, loop over the 1M - 4G address range, trying to mmap the
    // file at an exact location; print out the address ranges that succeed
    for (size_t i = MEG; i; i += MEG) { // at 4G, i wraps to 0 (assumes 32-bit size_t)
        void *p = mmap((char*)i, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            exit(1);
        }
        if (p == (void*)i) {
            match = true;
            if (first) {
                first = false;
                low = i;
            }
            high = i;
        } else if (match) {
            match = false;
            first = true;
            cout << low/MEG << "M to " << high/MEG << "M" << endl;
        }
        if (munmap((char*)p, size) == -1) {
            perror("munmap");
            exit(1);
        }
    }
    if (match) // flush the final range if the loop ended while still matching
        cout << low/MEG << "M to " << high/MEG << "M" << endl;
    if (close(fd) == -1) {
        perror("close");
        exit(1);
    }
    if (unlink(file) == -1) {
        perror("unlink");
        exit(1);
    }
    return 0;
}
Here's the output on a Red Hat 9 machine.
$ a.out 100
1M to 28M
129M to 924M
1058M to 2971M
And a Solaris 8 machine.
$ a.out 100
3980M to 3980M
This shows that the allocator can't work on Solaris, and further points to other portability problems.
I believe this is the reason why other people haven't tried to create an STL based container using shared memory. In the implementations I've seen of shared memory containers, instead of storing pointer addresses in shared memory (which are useful only if all processes have the shared memory mapped at the same location), offsets are used. This effectively rules out using STL based containers and iterators.
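The offset scheme described above can be sketched in a few lines; the node type and helpers below are illustrative, not taken from any particular implementation:

```cpp
#include <cstddef>

// Nodes store byte offsets from the segment base instead of raw
// pointers, so the same data is valid no matter where each process
// happens to map the segment. Offset 0 plays the role of a null pointer.
struct OffsetNode {
    size_t value;
    size_t next_off;
};

inline OffsetNode *node_at(char *base, size_t off)
{
    return off ? reinterpret_cast<OffsetNode *>(base + off) : 0;
}

// Walk an offset-linked list and sum its values, given any base address.
size_t sum_list(char *base, size_t head_off)
{
    size_t total = 0;
    for (OffsetNode *n = node_at(base, head_off); n; n = node_at(base, n->next_off))
        total += n->value;
    return total;
}
```

The price of this portability is exactly the limitation noted above: raw-pointer iterators, and thus standard STL containers, can't be used directly on offset-linked data.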
Regards,
Jason Haynberg
How does the mmap man page on Solaris describe the start parameter of
mmap()? Does it work better for smaller mapped files, for example for
1M or 10M files? Or is the parameter meaningless under Solaris? I
don't have ready access to a Solaris box.
If you need to work on Solaris, this would seem to be an issue, but
from my perspective, it's more an issue with the Solaris implementation
of mmap(). Although, I can see how individuals familiar with the
Solaris implementation of mmap might feel comfortable with it. I
can't see why the OS could not grant simple processes their requested
mappings. The virtual address space is so large. It seems like
a very lazy implementation of mmap().
I agree that, given your experiment, this allocator would not appear
to be a good solution for your needs.
Marc
> How does the mmap man page on Solaris describe the start parameter of
> mmap()?
Same as Linux, that it's meant as a hint to where the mapping should be placed.
> Does it work better for smaller mapped files, for example for
> 1M or 10M files?
No. It seems Solaris always tries to place the mapping as far as it can to the 4G limit.
> If you need to work on Solaris, this would seem to be an issue, but
> from my perspective, it's more an issue with the Solaris implementation
> of mmap().
Though that may be true, a library positioned as general purpose shouldn't rely on platform-dependent behavior.
Thanks for all your help.
Jason