#30 support for piddles >2**31 elements

None
closed
None
7
2013-09-14
2010-08-04
No

With the common availability of 64bit OSes and machines
with more than 4GiB of memory, it is apparent that the
current PDL size is limited by the implementation to 2GiB
(pdl_malloc and hence SvGROW are called with sizes
as int instead of STRLEN).
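
As a rough illustration of the limit described above, the sketch below computes a byte count in 64 bits and checks whether it would still fit in a 32-bit signed size. The helper names `nbytes_u64` and `fits_in_int32` are hypothetical and are not part of PDL or the Perl API; the sketch only assumes that `int` is 32 bits wide, as it is on common platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes needed for nvals elements of
   elem_size bytes each, computed in an unsigned 64-bit type. */
uint64_t nbytes_u64(uint64_t nvals, uint64_t elem_size)
{
    /* Guard against overflow of the 64-bit product itself. */
    assert(elem_size != 0 && nvals <= UINT64_MAX / elem_size);
    return nvals * elem_size;
}

/* Hypothetical helper: would this byte count survive being passed
   through a 32-bit signed size argument? */
int fits_in_int32(uint64_t nbytes)
{
    return nbytes <= (uint64_t)INT32_MAX;
}
```

For example, 300 million 8-byte doubles need 2,400,000,000 bytes, which is more than a 32-bit signed size can hold even though the element count itself is well below 2**31.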

The allocation limit fix should be straightforward. The
difficulty will be verifying that there are not hidden
assumptions about the size of piddles in the other
code (e.g., the PDL::PP stuff).

Thanks to Albrecht Schmid for reporting the issue.

Discussion

  • Chris Marshall

    Chris Marshall - 2010-08-04

    There are a bunch of places that assume nvals is an
    int in the PDL core routines. Those will need to be
    modified. Any routines accessing the internal pdl
    structure will need to be modified if it uses nvals.

    This is too much to do and verify before the PDL-2.4.7
    release but we will plan to address the problem after
    that release occurs.

     
  • Chris Marshall

    Chris Marshall - 2010-10-13

    In addition to the 2GiB limit on the number
    of bytes in the pdl data, any conversion to
    a perl array implies a maximum of ~2**31
    elements (not bytes) since the array keys
    are represented internally by I32 values.

     
  • Chris Marshall

    Chris Marshall - 2010-11-28

    Looking into this further: the return value of nelem
    would need to be changed to match nvals so the
    possible effect could include all usages of nelem
    that assume an int value. The checks for this would
    be more extensive than could be verified before
    the planned 2.4.8 *minor* bugfix and polishing
    release.

     
  • Chris Marshall

    Chris Marshall - 2011-03-07

    In addition to fixes in the PDL Core routines,
    modifications will also be needed in the PP
    and other threading/indexing routines that
    assume that the PDL long datatype is sufficient
    to hold an index.

    This could be an important problem as a number
    of the routines do their magic on a flattened
    piddle which is exactly when the current use of
    long for offsets would break down.
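
To make the flattened-offset concern concrete, here is a minimal sketch, assuming a 2-D array stored with the first dimension varying fastest and a 32-bit index type. The function names are hypothetical and do not come from the PDL sources.

```c
#include <assert.h>
#include <stdint.h>

/* Flat offset of element (i, j) in a 2-D array whose first dimension
   has n0 elements and varies fastest, computed entirely in 64 bits. */
int64_t flat_offset64(int64_t i, int64_t j, int64_t n0)
{
    assert(i >= 0 && i < n0 && j >= 0);
    return i + j * n0;
}

/* Whether an offset is representable in a 32-bit signed index. */
int offset_fits_32(int64_t off)
{
    return off >= INT32_MIN && off <= INT32_MAX;
}
```

With a 50000 x 50000 array, neither dimension is anywhere near 2**31, yet the flat offset of the last element is 2,499,999,999 and no longer fits in a 32-bit index.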

     
  • Chris Marshall

    Chris Marshall - 2011-03-09

    It might be slightly easier to implement a large pdl mode
    by changing the pdl size types from long to unsigned long.
    We would still need to check that the PP looping code does
    not depend on negative offsets.

    True 64bit pdl support can only be on platforms with 64bit
    pointers. Need to check that $Config{ptrsize} == 8...
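
A small sketch of the two checks mentioned in this comment, written in C rather than Perl. All function names are hypothetical. The first pair shows why an unsigned index type is awkward when a loop steps with a negative increment; the last function is a C-side analogue of the `$Config{ptrsize} == 8` test.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the last element visited when walking n elements from
   `start` with a signed increment (negative means stepping backwards). */
int64_t last_offset_signed(int64_t start, int64_t inc, int64_t n)
{
    assert(n > 0);
    return start + inc * (n - 1);
}

/* The same increment stored in an unsigned 64-bit type. The conversion
   is well defined in C (modulo 2**64), so -1 becomes a huge positive
   value and can never compare as less than zero. */
uint64_t stride_as_unsigned(int64_t inc)
{
    return (uint64_t)inc;
}

/* C analogue of checking for 8-byte pointers. */
int has_64bit_pointers(void)
{
    return sizeof(void *) == 8;
}
```

Any loop that tests the sign of an increment, or compares an offset against zero, would need review before switching to an unsigned type.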

     
  • Chris Marshall

    Chris Marshall - 2011-07-25

    We also need to change dims and dimincs from PDL_Long to the appropriate 64bit type.

     
  • Chris Marshall

    Chris Marshall - 2011-08-05
    • summary: support for piddles >2 GiB --> support for piddles >2**31 elements
     
  • Chris Marshall

    Chris Marshall - 2011-08-05

    Since the number of elements in a piddle is represented
    as an int, the maximum number of elements possible is
    ~2**31. Assuming 64-bit pointers and addressing, it should
    be possible to increase the maximum size of a piddle by
    changing the index type from int/long to some int64 type.

     
  • Chris Marshall

    Chris Marshall - 2011-08-07

    An additional note: supporting PDLs larger than 2GiB with less
    than 2**31 elements is a much easier fix than the true 64bit
    PDL which would require changing all the index types from ints
    to int64 types.

     
  • Chris Marshall

    Chris Marshall - 2012-04-03
    • priority: 5 --> 7
     
  • Chris Marshall

    Chris Marshall - 2012-04-03

    A more careful review of converting the PDL API to 64bits
    shows that in addition to changing the type of nvals to 64bits,
    all the PDL_Long incs, offs, def_incs, def_dims, and def_dimincs
    need to be converted to an appropriate 64bit type. See the
    definitions of pdl_trans_affine, pdl_vaffine, and pdl structures
    in pdl.h.

    It might make sense to make a typedef appropriate for PDL
    dimension increments in general so that it will be clear when
    something is dealing with that level of the API and data structures.
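
One way the typedef idea could look, as a sketch only. The names `ex_indx`, `ex_array`, `ex_set_dims` and `EX_NDIMS` are hypothetical and are not the definitions in pdl.h; the struct only mirrors the idea of keeping the element count, dims and dimension increments in a single index type.

```c
#include <assert.h>
#include <stdint.h>

#define EX_NDIMS 6            /* hypothetical fixed capacity */

typedef int64_t ex_indx;      /* hypothetical index/increment type */

struct ex_array {
    ex_indx nvals;              /* total number of elements */
    ex_indx dims[EX_NDIMS];     /* size of each dimension */
    ex_indx dimincs[EX_NDIMS];  /* element stride of each dimension */
    int     ndims;
};

/* Fill in dims, default increments (first dimension fastest) and nvals,
   doing all arithmetic in the index type. */
void ex_set_dims(struct ex_array *a, const ex_indx *dims, int ndims)
{
    assert(ndims > 0 && ndims <= EX_NDIMS);
    ex_indx inc = 1;
    for (int k = 0; k < ndims; k++) {
        a->dims[k] = dims[k];
        a->dimincs[k] = inc;
        inc *= dims[k];
    }
    a->ndims = ndims;
    a->nvals = inc;
}
```

Because every size and stride shares one typedef, widening the index later means changing a single line, and any code still using a plain int or long for an index stands out during review.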

     
  • Chris Marshall

    Chris Marshall - 2012-05-19

    Work for this feature is underway in the 64bit-index-support branch of pdl git. Testers and contributors welcome! In particular, it would accelerate development if a volunteer could implement a 64bit-index-support.t test suite to check edge cases to ensure things work the same for small (<2GiB) and large (>4G-elements) piddles.

     
  • Chris Marshall

    Chris Marshall - 2013-09-14
    • status: open --> closed
    • assigned_to: Chris Marshall
    • Group: -->
     
  • Chris Marshall

    Chris Marshall - 2013-09-14

    64bit index support in PDL-2.006_07 and higher.

     
