
#244 PDL::IO::FlexRaw mapflex memory mapping fails

closed-fixed
nobody
other (94)
3
2010-11-26
2010-07-17
No

There seems to be a deep bug in PDL::IO::FlexRaw that prevents mapflex from working.

The problem is not with mmapping per se: the mapfraw function in FastRaw does not encounter trouble. As far as I can tell, the difference is that FastRaw does not use set_data_by_offset, but FlexRaw does, because FlexRaw allows you to work with multiple piddles in the same file. The set_data_by_offset function sets the data member of the piddle struct, but it never provides any magic for undoing that action. As far as I can tell, piddles that are mmapped using FlexRaw do not perform any refcounting on the underlying mmapped data, so nothing knows when the last such piddle goes out of scope (and therefore the data is never unmapped?).

AFAIK, only FlexRaw uses set_data_by_offset, and FlexRaw's use was never tested, even with the Fortran-based code. set_data_by_offset was not covered in our test suite until my latest additions to the test suite for FlexRaw. We could have had this broken function hanging around for a long time and just not known about it.

As of the time of writing (commit 25ffbb5edd), tests 5 (A piddle and its mapflex representation should be about equal) and 6 (Modifications to mapfraw should be saved to disk no later than when the piddle ceases to exist) fail, and the test script completely croaks before getting to the last two tests.

Discussion

  • David Mertens

    David Mertens - 2010-07-17

    Here's my output from perldl -V:

    perlDL shell v1.352
    PDL comes with ABSOLUTELY NO WARRANTY. For details, see the file
    'COPYING' in the PDL distribution. This is free software and you
    are welcome to redistribute it under certain conditions, see
    the same file for details.

    Summary of my PDL configuration

    VERSION: PDL v2.4.6_015 (supports bad values)

    $%PDL::Config = {
    'BADVAL_PER_PDL' => '0',
    'WITH_PROJ' => undef,
    'FFTW_TYPE' => 'double',
    'FFTW_LIBS' => [
    '/lib',
    '/usr/lib',
    '/usr/local/lib'
    ],
    'WITH_FFTW' => undef,
    'GSL_LIBS' => undef,
    'GL_BUILD' => '1',
    'WITH_IO_BROWSER' => '0',
    'PROJ_INC' => undef,
    'WHERE_PLPLOT_INCLUDE' => undef,
    'WITH_KARMA' => '0',
    'WHERE_KARMA' => undef,
    'HTML_DOCS' => '1',
    'WHERE_PLPLOT_LIBS' => undef,
    'WITH_3D' => '1',
    'FFTW_INC' => [
    '/usr/include/',
    '/usr/local/include'
    ],
    'WITH_POSIX_THREADS' => '1',
    'POGL_VERSION' => '0.63',
    'HIDE_TRYLINK' => '1',
    'WITH_HDF' => undef,
    'HDF_INC' => undef,
    'POGL_WINDOW_TYPE' => 'glut',
    'OPENGL_LIBS' => '-L/usr/lib/mesa -L/usr/lib/ -L/usr/lib/mesa -lGLU -lGL -lXext -lX11 -lm',
    'WITH_BADVAL' => '1',
    'WITH_GD' => undef,
    'FITS_LEGACY' => '1',
    'WITH_SLATEC' => undef,
    'BADVAL_USENAN' => '0',
    'WITH_DEVEL_REPL' => '1',
    'TEMPDIR' => '/tmp',
    'PROJ_LIBS' => undef,
    'USE_POGL' => '0',
    'GD_LIBS' => undef,
    'GSL_INC' => undef,
    'GD_INC' => undef,
    'WITH_GSL' => undef,
    'OPTIMIZE' => undef,
    'HDF_LIBS' => undef,
    'MALLOCDBG' => {},
    'WITH_MINUIT' => undef,
    'WITH_PLPLOT' => '1',
    'MINUIT_LIB' => undef
    };
    Summary of my perl5 (revision 5 version 10 subversion 1) configuration:

    Platform:
    osname=linux, osvers=2.6.24-27-server, archname=i486-linux-gnu-thread-multi
    uname='linux vernadsky 2.6.24-27-server #1 smp fri mar 12 01:45:06 utc 2010 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i486-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.10 -Darchlib=/usr/lib/perl/5.10 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.10.1 -Dsitearch=/usr/local/lib/perl/5.10.1 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.10.1 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
    Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.4.3', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
    Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /usr/lib64
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.11.1.so, so=so, useshrplib=true, libperl=libperl.so.5.10.1
    gnulibc_version='2.11.1'
    Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

     
  • Chris Marshall

    Chris Marshall - 2010-07-17

    Does mapfraw work from IO::FastRaw?
    If so, does mapflex work with a single piddle?

     
  • Chris Marshall

    Chris Marshall - 2010-07-17

    From looking at Core.xs and the args to set_data_by_mmap() it appears that
    the mmap is set up with shared equal 0 by default which means that any
    changes are private and would not flow back to the global memory or file
    object. I think the call needs to use MAP_SHARED in order for changes to
    be visible.
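
To make the MAP_SHARED point concrete, here is a minimal C sketch of the general POSIX behaviour. It is not taken from Core.xs; the helper name write_via_mmap and the scratch-file handling are made up for illustration. It writes one byte through a mapping and then reads that byte back from the file itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of PDL): create a 16-byte zero-filled
 * scratch file at `path`, map it with `flags` (MAP_SHARED or MAP_PRIVATE),
 * store `value` in the first byte through the mapping, unmap, and then
 * read the first byte back from the file.
 * Returns the byte found on disk, or -1 on error. The file is removed. */
int write_via_mmap(const char *path, int flags, char value)
{
    const size_t len = 16;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); unlink(path); return -1; }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (p == MAP_FAILED) { close(fd); unlink(path); return -1; }

    p[0] = value;                      /* modify the mapped memory */
    if (flags & MAP_SHARED)
        msync(p, len, MS_SYNC);        /* push shared changes to the file */
    munmap(p, len);

    char on_disk = 0;
    ssize_t n = pread(fd, &on_disk, 1, 0);   /* what the file really holds */
    close(fd);
    unlink(path);
    return n == 1 ? (int)(unsigned char)on_disk : -1;
}
```

With MAP_SHARED the stored byte should come back from the file; with MAP_PRIVATE the write stays in a private copy and the file keeps its original zero byte. Whether this is what actually distinguishes the FastRaw and FlexRaw code paths is a separate question.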

     
  • David Mertens

    David Mertens - 2010-07-17

    > Does mapfraw work from IO::FastRaw?
    Yes

    > If so, does mapflex work with a single piddle?
    No

    > From looking at Core.xs and the args to set_data_by_mmap() it appears that
    > the mmap is set up with shared equal 0 by default which means that any
    > changes are private and would not flow back to the global memory or file
    > object. I think the call needs to use MAP_SHARED in order for changes to
    > be visible.
    Maybe, but FastRaw works. The tests that fail for FlexRaw were just copied
    over from FastRaw and modified so they used the correct calling conventions.

     
  • Chris Marshall

    Chris Marshall - 2010-08-06
    • labels: 101696 --> other
    • assigned_to: run4flat --> nobody
    • priority: 5 --> 3
    • status: open --> open-postponed
     
  • Chris Marshall

    Chris Marshall - 2010-08-06

    Due to the lack of time before the PDL-2.4.7 release, I'm marking this
    bug as Postponed for revisiting after the coming release.

     
  • Chris Marshall

    Chris Marshall - 2010-08-10
    • summary: FlexRaw mapflex fails --> PDL::IO::FlexRaw mapflex memory mapping fails
     
  • Chris Marshall

    Chris Marshall - 2010-10-12

    One direction that might be worth investigating is
    to use the File::Map module for mmapping to perl
    scalars instead of hand-rolled code. I don't know
    if that would work for the needs of PDL but if so,
    it could make mmap work cross-platform since the
    File::Map module supports win32 and other OSes.

     
  • Chris Marshall

    Chris Marshall - 2010-10-12
    • milestone: 101029 -->
    • status: open-postponed --> open
     
  • Chris Marshall

    Chris Marshall - 2010-11-11

    Looking at some recent CPAN Testers failures for *BSD
    systems with t/flexraw.t and PDL-2.4.7_004 I tracked
    down some issues here:

    (1) There does appear to be a problem with the
    scope of the pdl that is the whole mmap'd file. It
    should probably have a refcount for each piddle
    that is mapped via the offset. When I keep a ref
    to the per-file pdl from mapflex then access to the
    mmapped data works.

    (2) By default mapflex() maps the files in ReadOnly
    mode. If you add the { ReadOnly=>0 } option to the
    mapflex calls then the updates to the disk data work
    as expected.

    (3) I don't know what the message
    Warning: special data without datasv is not freed currently!!
    means but it seems to be related to freeing of pdls.

    --Chris
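
The refcount idea in (1) could look roughly like the following C sketch. All of the names here (file_map, map_view, file_map_open, view_at, view_release) are hypothetical and do not correspond to PDL's structs or to anything in Core.xs. The sketch only shows the bookkeeping: one mapping for the whole file, one count per view handed out at an offset, and an unmap when the last view goes away.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* One shared mapping of a whole file (hypothetical type, not PDL's). */
typedef struct {
    char  *base;   /* start of the mapped region */
    size_t len;    /* length of the mapping in bytes */
    int    refs;   /* number of live views into the mapping */
} file_map;

/* A view of the mapping at some byte offset, standing in for one piddle. */
typedef struct {
    file_map *parent;
    char     *data;   /* parent->base + offset */
} map_view;

/* Map `len` bytes of `path` read/write and shared, creating the file if needed. */
file_map *file_map_open(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return NULL;
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); return NULL; }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                        /* the mapping outlives the descriptor */
    if (p == MAP_FAILED) return NULL;

    file_map *m = malloc(sizeof *m);
    if (!m) { munmap(p, len); return NULL; }
    m->base = p;
    m->len  = len;
    m->refs = 0;
    return m;
}

/* Hand out a view at `offset` and count it against the parent mapping.
 * The caller must keep `offset` below the mapping length. */
map_view view_at(file_map *m, size_t offset)
{
    map_view v = { m, m->base + offset };
    m->refs++;
    return v;
}

/* Drop one view; unmap and free the parent when the last view is gone.
 * Returns 1 if the mapping was released, 0 if other views still use it. */
int view_release(map_view *v)
{
    file_map *m = v->parent;
    v->parent = NULL;
    v->data   = NULL;
    if (--m->refs > 0) return 0;
    munmap(m->base, m->len);
    free(m);
    return 1;
}
```

In this arrangement no view can outlive the memory it points into, which is the property that keeping a ref to the per-file pdl provides by hand.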

     
  • Chris Marshall

    Chris Marshall - 2010-11-11

    It turns out that ReadOnly was set by default to 1 for mapflex,
    although the docs say it should default to false. Fixing that makes
    the t/flexraw.t tests pass. It still doesn't address the problem of
    more than one piddle being mmapped from a file via offset
    from a parent piddle for the whole file.

     
  • Chris Marshall

    Chris Marshall - 2010-11-12
    • status: open --> pending-fixed
     
  • Chris Marshall

    Chris Marshall - 2010-11-12

    Fixed in git and available on CPAN for PDL versions
    2.4.7_005 and higher. On the way to the fix, I noted
    that it might be useful to revisit the API for
    writeflex and mapflex as far as creating/writing data
    files. As is, it is fairly clunky and not clear how things
    work with multiple pdls in the same file...

     
  • SourceForge Robot

    This Tracker item was closed automatically by the system. It was
    previously set to a Pending status, and the original submitter
    did not respond within 14 days (the time period specified by
    the administrator of this Tracker).

     
  • SourceForge Robot

    • status: pending-fixed --> closed-fixed
     
