
Is this really safe?

2004-05-07
2004-05-14
  • Nobody/Anonymous

    My concern is about how you pick the memory address. 

    What if you need many, large shared containers and wish to share them amongst many applications with differing memory requirements?  For example, if some applications use 2G or more of memory, what are the safe memory addresses that you're sure won't be used across all of your applications?  Couldn't picking the wrong address cause a segment violation or, worse, overwrite the stack or heap?

    Willing, but concerned,
    Jason Haynberg
    Trading Systems Developer

     
    • Marc Bumble

      Marc Bumble - 2004-05-07

      The key_addr  is used as  the first argument  in the mmap call  in the
      shared_memory.h file  to map the  shared memory to a  preferred memory
      address.   What  effectively  is  happening  is  that  the  key_addr[]
      location is  being requested.  If  the location is not  available, the
      request  will be  denied,  and the  mmap  call will  fail causing  the
      allocator  to  throw  an  mmap_exception.  The  mmap_exception  is  an
      internally generated  allocator exception type.   See the man  page on
      mmap for more info.
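
To make the request-and-check step concrete, here is a minimal sketch. It assumes the region is backed by an already opened descriptor; `map_at` and `mmap_error` are hypothetical names, standing in for whatever shared_memory.h and its mmap_exception actually look like. Without MAP_FIXED the first argument to mmap is only a hint, so the sketch treats a mapping that lands somewhere else as a failure, in addition to MAP_FAILED.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the allocator's mmap_exception.
struct mmap_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Ask the kernel to map `len` bytes of `fd` at `hint`.  The address is only
// a request, so the returned pointer is compared against it.
void *map_at(void *hint, std::size_t len, int fd) {
    void *p = mmap(hint, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        throw mmap_error("mmap failed");
    if (p != hint) {                 // the kernel chose a different address
        munmap(p, len);
        throw mmap_error("requested address not available");
    }
    return p;
}
```

A caller would catch `mmap_error` and either give up or retry with another address that all cooperating processes agree on.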

      The operating system provides the memory.  A segmentation violation is
      not possible  as the OS would  not provide such  an incorrect address.
      Nor will the OS provide an illegal address from the heap or stack.

      The whole system is based on cooperative processes using shared memory
      though, so  if the  processes do not  use the semaphore  interlocks to
      coordinate access to  the shared objects or if  one process runs amok,
      problems are likely to occur.  The original intent was to allow object
      level locking  on sets of objects.   Of course the whole  project is a
      work in progress.  I  currently do not  have research funding  for the
      summer,  so I  am available  if anyone  would like  to contract  me to
      explore specific  issues through educational  grants.  Otherwise, feel
      free to submit questions via email and I will answer them as best as I
      can.  The system is GPL and comes with no warranties.

      Marc

       
    • Nobody/Anonymous

      Thanks for the reply. 

      Well, OK, forget about the seg violation part.  How can you pick a realistic address range (or ranges) that you can guarantee is not already used across many different applications? 

      Also, do you know of major corporations using your software?  I noticed it is marked beta - do you feel it's production safe?

      Thanks!
      Jason

       
      • Marc Bumble

        Marc Bumble - 2004-05-10

        Jason,

        It's open source.  Either you or an independent agent hired by you can
        determine if it is safe for yourselves.  Further, you can publish your
        results.  It comes  with no warranties.  I have  published articles on
        this allocator at two peer reviewed conferences.

        If an address range is selected which is already in use, the allocator
        throws an  internally generated mmap exception.  One  concern would be
        synchronizing  multiple  processes  to   recover  and  then  have  all
        participating processes  reselect the same new address  range which is
        unused  address space  across  all of  the  processes' virtual  memory
        spaces.

        I have no idea who is using the allocator.  Corporations or otherwise.
        I do respond  to queries for fixes and for features,  which is sort of
        how it develops.   Also, having seen the internals  for fielded US Air
        Traffic Control  software, I  am no longer  sure what  production safe
        means.

        That being said,  I would appreciate any feedback  you provide and any
        suggestions  for improving  the package.   I am  currently  working on
        better   encapsulating  the  classes   and  accelerating   the  memory
        allocation performance.

        How  are  you  interested   in  applying  the  allocator?   Are  there
        attributes which would make you feel more comfortable about its design
        and safety?   What do  other software  writers do to  give you  a warm
        fuzzy that their software is safe and effective for your use?

        Marc

         
    • Nobody/Anonymous

      > If an address range is selected which is already in use ... have all participating processes reselect the same new address range which is unused address space across all of the processes' virtual memory spaces.

      So then all the applications would need to agree on some algorithm for increasing the starting address, say by 1MB increments.  And they would also need to agree on some algorithm to allow only one application to create the mmap region; for example, if multiple applications try to create the same mmaped region at the same time, one should succeed and the others should react to the error appropriately, that is, open it again without the create flag.  Do you agree?

      > I am no longer sure what production safe means

      Ouch :-)  I see what you mean, I too have been surprised by so-called production level code in commercial server software and real-time trading systems.

      >  I would appreciate any feedback you provide

      Thank you.  A related question, what happens when the container needs more memory than is available in the mmaped region?

      > How are you interested in applying the allocator?

      We need an easy mechanism to share data amongst applications.  We're looking at alternatives, like yours, though we may decide to write a simple interface that merely reads/writes data to an mmapped fixed-sized C-style array with a header containing the necessary locks. 
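
A rough sketch of that simpler alternative, under several assumptions: the names `SharedTable`, `create_table` and `append` are made up for illustration, the region is an anonymous shared mapping (visible to children after fork) instead of a mapped file, and the lock is a process-shared pthread mutex. Because the layout holds only indices and values, never raw pointers, it does not matter at which address each process sees the mapping.

```cpp
#include <pthread.h>
#include <sys/mman.h>
#include <cstddef>

// Header (lock + element count) followed by a fixed-size array.
struct SharedTable {
    pthread_mutex_t lock;          // initialised as PTHREAD_PROCESS_SHARED
    std::size_t     count;         // number of slots in use
    double          values[1024];
};

// Map a shared region and initialise the header; returns nullptr on failure.
SharedTable *create_table() {
    void *p = mmap(nullptr, sizeof(SharedTable), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return nullptr;
    SharedTable *t = static_cast<SharedTable *>(p);
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&t->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    t->count = 0;
    return t;
}

// Append one value under the lock; returns false when the array is full.
bool append(SharedTable *t, double v) {
    pthread_mutex_lock(&t->lock);
    bool ok = t->count < 1024;
    if (ok)
        t->values[t->count++] = v;
    pthread_mutex_unlock(&t->lock);
    return ok;
}
```

The price is the fixed capacity: unlike an STL container, the table cannot grow once the region is mapped.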

      > Are there attributes which would make you feel more comfortable about its design and safety?

      Admittedly, picking the actual memory address of where an object should reside feels very unnatural, non-standard, and undefined.  Though, I do see why you did it - basically, the use of STL iterators, which are ultimately pointers, mandates this.

      > What do other software writers do to give you a warm fuzzy that their software is safe and effective for your use?

      The more major corporations that use it, the better the perception is.  Just like in commercial software, those that have the bigger marquee clients are usually considered better.

      Jason

       
      • Marc Bumble

        Marc Bumble - 2004-05-11

        > "Jason"  writes:

          >> If an address range is selected which is already in use ... have
          >> all participating
          >> processes reselect the same new address range which is unused
          >> address space across all of the processes' virtual memory spaces.

          > So then all the applications would need to agree on some algorithm
          > for increasing the starting address, say by 1MB increments.  And
          > they would also need to agree on some algorithm to allow only one
          > application to create the mmap region; for example, if multiple
          > applications try to create the same mmaped region at the same
          > time, one should succeed and the others should react to the error
          > appropriately, that is, open it again without the create flag.  Do
          > you agree?

        I think it's more complicated than that.  The allocator is designed to
        be used by separate processes, each of which has its own virtual
        address  space.  If each  application is  an individual  process, each
        will have its own address space,  so all of the mmaps can succeed.  So
        there is  only a potential  conflict if within an  individual process,
        something  has already  been mapped  into the  desired  memory region.
        With shm_open(), on the other hand, one process will create the shared
        memory segment and the rest will open the existing segment.
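
The create-once, open-elsewhere pattern with shm_open() might look like the sketch below. `create_or_open` is a hypothetical helper, not part of the allocator; it relies on O_CREAT | O_EXCL so that exactly one caller creates and sizes the segment, while everyone else falls back to a plain open.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>

// Returns a descriptor for the named segment, or -1 on error.
// `created` tells the caller whether it won the race to create it.
int create_or_open(const char *name, std::size_t len, bool &created) {
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd != -1) {
        created = true;
        if (ftruncate(fd, static_cast<off_t>(len)) == -1) {
            close(fd);
            shm_unlink(name);      // do not leave a zero-length segment behind
            return -1;
        }
        return fd;
    }
    created = false;
    if (errno != EEXIST)           // failed for some other reason
        return -1;
    return shm_open(name, O_RDWR, 0600);   // someone else created it
}
```

One caveat: a process that opens the segment may see it before the creator has finished ftruncate(), so real code would still need some handshake (a semaphore, for instance) before touching the contents.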

          >> I am no longer sure what production safe means

          > Ouch :-) I see what you mean, I too have been surprised by
          > so-called production level code in commercial server software and
          > real-time trading systems.

          >> I would appreciate any feedback you provide

          > Thank you.  A related question, what happens when the container
          > needs more memory than is available in the mmaped region?

        The allocator creates new shared memory segments and uses the space
        from these new segments.

          >> How are you interested in applying the allocator?

          > We need an easy mechanism to share data amongst applications.
          > We're looking at alternatives, like yours, though we may decide to
          > write a simple interface that merely reads/writes data to an
          > mmapped fixed-sized C-style array with a header containing the
          > necessary locks.

        You might also  use files and flock() which  would give you persistent
        storage.  The allocator was developed for sharing data through the STL
        containers between multiple processes at relatively high speed.
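
For the file-plus-flock() route, a minimal sketch could be the following. `locked_append` is a hypothetical helper; it assumes every cooperating process takes the same advisory lock, since flock() does not stop a process that simply ignores it.

```cpp
#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

// Append `len` bytes to `path` while holding an exclusive advisory lock.
bool locked_append(const char *path, const void *buf, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd == -1)
        return false;
    if (flock(fd, LOCK_EX) == -1) {    // blocks until the lock is granted
        close(fd);
        return false;
    }
    ssize_t n = write(fd, buf, len);
    flock(fd, LOCK_UN);
    close(fd);
    return n == static_cast<ssize_t>(len);
}
```

The data survives process exit because it lives in a file, at the cost of going through read/write calls instead of touching memory directly.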

        Marc

         
    • Nobody/Anonymous

      Some notes:

      "Use of MAP_FIXED may result in unspecified behaviour in further use of brk(2), sbrk(2), malloc(3C), and shmat(2).  The use of MAP_FIXED is discouraged, as it may prevent an implementation from making the most effective use of resources."

      - Solaris 8 mmap man page

      "MAP_FIXED: Use of this option is discouraged."

      - Linux Red Hat Enterprise 3 mmap man page

      I noticed in shared_memory.h, you comment out the use of MAP_FIXED, in favor of MAP_SHARED - I assume you check later if the address returned from mmap is not the address specified as the start address?  What made you not use MAP_FIXED?

      Jason

       
      • Marc Bumble

        Marc Bumble - 2004-05-11

        > "Jason"  writes:

        > I noticed in shared_memory.h, you comment out the use of MAP_FIXED,
        > in favor  of MAP_SHARED - I  assume you check later  if the address
        > returned  from mmap  is  not  the address  specified  as the  start
        > address?  What made you not use MAP_FIXED?

        I initially experimented with  MAP_FIXED, but switched to MAP_SHARED
        based on the man page recommendations, as I recall.  The return value
        check is directly below the mmap call.

         
    • Nobody/Anonymous

      > So there is only a potential conflict if within an individual process, something has already been mapped into the desired memory region.

      This is exactly my point. 

      [ and the related ]
      > [ When the container needs more memory than is available in the mmaped region, the ] allocator creates new shared memory segments and uses the space from these new segments.

      Don't you agree that this problem can happen multiple times and can get complicated?  The problem being trying to find an unused address space across multiple applications - whether it be during the first use (of, let's say, a vector) or during every re-allocation (that is, when the vector needs more memory than its capacity). 

       
      • Marc Bumble

        Marc Bumble - 2004-05-11

        >  Jason writes:

        >> So  there is  only a  potential  conflict if  within an  individual
        >> process, something has already  been mapped into the desired memory
        >> region.
        >
        > This is exactly my point. 
        >
        > [ and the related ]
        >
        >> [ When  the container  needs more memory  than is available  in the
        >> mmaped region,  the ] allocator creates new  shared memory segments
        >> and uses the space from these new segments.
        >
        > Don't you agree that this problem can happen multiple times and can
        > get  complicated?   The problem  being  trying  to  find an  unused
        > address space  across multiple applications - whether  it be during
        > the  first   use  (of,  let's  say,  a  vector)  or   during  every
        > re-allocation (that is, when the  vector needs more memory than its
        > capacity).

        I haven't experienced this problem and no one has yet reported this as
        a problem.  Easiest way to find out is to test it.

        My use of  the allocator is to generally  initialize STL objects early
        and keep some around for the  duration of the process.  I usually have
        a set of objects which are modified over time.

        Marc

         
    • Nobody/Anonymous

      The allocator relies on the mmap system call to map a file into memory at a specific address in the process space.  The following is a program that will test which addresses can be used.  It prints out the address ranges which succeed, that is, when the address returned from mmap is the same that was requested.

      #include <sys/mman.h>
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <fcntl.h>
      #include <unistd.h>
      #include <cstdio>    // perror
      #include <cstdlib>   // atoi, exit
      #include <iostream>
      using namespace std;

      int main(int argc, char **argv)
      {
        if (argc < 2) {
          cerr << "usage: " << argv[0] << " <size of mmaped file in MB>" << endl;
          exit(1);
        }

        size_t MEG = 1024 * 1024,
               size = atoi(argv[1]) * MEG,
               low = 0,
               high = 0;

        bool first = true, match = false;

        const char *file = "test";

        int fd = open(file, O_RDWR | O_CREAT, 0666);
        if (fd == -1) {
          perror("open");
          exit(1);
        }

        int res = ftruncate(fd, size);
        if (res == -1) {
          perror("ftruncate");
          exit(1);
        }

        // in 1M increments, loop over the 1M - 4G address range, trying to mmap a
        // file at an exact location; print out the address ranges that succeed.
        // The step counter stops the loop at 4G even where size_t is 64 bits
        // wide and i would not wrap around to 0.
        size_t i = MEG;
        for (int step = 1; step < 4096; ++step, i += MEG) {

          // no MAP_FIXED: the address is only a hint to the kernel
          void *p = mmap((char*)i, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
          if (p == MAP_FAILED) {
            perror("mmap");
            exit(1);
          }

          if (p == (void*)i) {
            // the hint was honoured: start or extend the current range
            match = true;
            if (first)  {
              first = false;
              low = i;
            }
            high = i;
          } else if (match) {
              // the hint was ignored: the current range has ended, report it
              match = false;
              first = true;
              cout << low/MEG << "M to " << high/MEG << "M" << endl;
          }

          if (munmap((char*)p, size) == -1) {
            perror("munmap");
            exit(1);
          }

        }

        // report a range that was still open when the loop ended
        if (match)
          cout << low/MEG << "M to " << high/MEG << "M" << endl;

        if (close(fd) == -1) {
          perror("close");
          exit(1);
        }

        if (unlink(file) == -1) {
          perror("unlink");
          exit(1);
        }

        exit(0);
      }

      Here's the output on a Red Hat 9 machine.

        $ a.out 100
        1M to 28M
        129M to 924M
        1058M to 2971M

      And a Solaris 8 machine.

        $ a.out 100
        3980M to 3980M

      This proves that the allocator can't work on Solaris, and further points to other problems with portability. 

      I believe this is the reason why other people haven't tried to create an STL-based container using shared memory.  In the implementations I've seen of shared memory containers, instead of storing pointer addresses in shared memory (which are useful only if all processes have the shared memory mapped at the same location), offsets are used.  This effectively rules out using STL-based containers and iterators.
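
As a small illustration of the offset approach (a sketch with made-up names, not taken from any particular library): each link is stored as a byte offset from the start of the region, and every process adds its own base address when it follows the link. The same bytes then give the same list wherever the region happens to be mapped.

```cpp
#include <cstddef>
#include <new>

// A list node whose link is an offset from the region start, not an address.
struct Node {
    int            value;
    std::ptrdiff_t next;     // offset of the next node, or -1 at the end
};

// Turn an offset into a pointer that is valid in this process only.
inline Node *at(char *base, std::ptrdiff_t off) {
    return off < 0 ? nullptr : reinterpret_cast<Node *>(base + off);
}

// Write a two-node list (values 1 and 2) at the start of the region.
// `base` must be suitably aligned for Node.
void make_demo(char *base) {
    new (base) Node{1, static_cast<std::ptrdiff_t>(sizeof(Node))};
    new (base + sizeof(Node)) Node{2, -1};
}

// Sum the list that starts at offset `head` inside the region at `base`.
int sum_list(char *base, std::ptrdiff_t head) {
    int total = 0;
    for (Node *n = at(base, head); n != nullptr; n = at(base, n->next))
        total += n->value;
    return total;
}
```

Copying the raw bytes to a buffer at a different address and walking the list from there yields the same sum, which is exactly what stored raw pointers cannot offer.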

      Regards,
      Jason Haynberg

       
      • Marc Bumble

        Marc Bumble - 2004-05-14

        How does the mmap man page  on Solaris describe the start parameter of
        mmap()?  Does it work better for smaller mapped files, for example for
        1M or 10M files?  Or is  the parameter meaningless  under Solaris?  I
        don't have ready access to a Solaris box.

        If you need  to work on Solaris,  this would seem to be  an issue, but
        from my perspective, it's more an issue with the Solaris implementation
        of  mmap().  Although,  I can  see how  individuals familiar  with the
        Solaris  implementation of  mmap might  feel comfortable  with  it.  I
        can't  envision  why  simple  processes  could  not  permit  requested
        mappings to occur.  The virtual address space is so large.  Seems like
        a very lazy implementation of mmap().

        I  agree that  given your  experiment, this  allocator  solution would
        appear to not be a good solution for your needs.

        Marc

         
    • Nobody/Anonymous

      > How does the mmap man page on Solaris describe the start parameter of
      > mmap()?

      Same as Linux, that it's meant as a hint to where the mapping should be placed. 

      > Does it work better for smaller mapped files, for example for
      > 1M or 10M files?

      No.  It seems Solaris always tries to place the mapping as far as it can to the 4G limit.

      > If you need to work on Solaris, this would seem to be an issue, but
      > from my perspective, it's more an issue with the Solaris implementation
      > of mmap().

      Though that may be true, a library shouldn't rely on platform-dependent features, unless it's not positioned as general purpose.

      Thanks for all your help.

      Jason

       
