#30 support for piddles >2**31 elements

None
closed
None
7
2013-09-14
2010-08-04
Chris Marshall
No

With the common availability of 64bit OSes and machines
with more than 4GiB of memory, it is apparent that the
current PDL size is limited by the implementation to 2GiB
(pdl_malloc and hence SvGROW are called with sizes
as int instead of STRLEN).
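The byte-count side of the limit can be sketched in C as follows. This is only an illustration: `buffer_bytes` and `fits_in_int` are hypothetical helpers, not PDL or perl API functions, and the sketch does not reproduce the real signatures of pdl_malloc or SvGROW. It shows that a buffer can have far fewer than 2**31 elements and still need a byte count that does not fit in an int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of PDL): compute the byte size of a buffer
 * holding nvals elements of elem_size bytes each. Returns 1 and stores the
 * result in *nbytes on success, 0 if the product would overflow size_t. */
int buffer_bytes(size_t nvals, size_t elem_size, size_t *nbytes)
{
    assert(elem_size > 0);
    if (nvals > SIZE_MAX / elem_size)
        return 0;
    *nbytes = nvals * elem_size;
    return 1;
}

/* True when a byte count can pass through an int parameter unchanged. */
int fits_in_int(size_t nbytes)
{
    return nbytes <= (size_t)INT_MAX;
}
```

For example, 300 million 8-byte values need 2.4e9 bytes, which exceeds INT_MAX on platforms with a 32-bit int even though the element count itself is well below 2**31.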

The allocation limit fix should be straightforward. The
difficulty will be verifying that there are no hidden
assumptions about the size of piddles elsewhere in the
code (e.g., the PDL::PP stuff).

Thanks to Albrecht Schmid for reporting the issue.

Discussion

  • Chris Marshall
    2010-08-04

    There are a bunch of places in the PDL core routines
    that assume nvals is an int. Those will need to be
    modified. Any routine that accesses the internal pdl
    structure will also need to be modified if it uses nvals.

    This is too much to do and verify before the PDL-2.4.7
    release but we will plan to address the problem after
    that release occurs.

     
  • Chris Marshall
    2010-10-13

    In addition to the 2GiB limit on the number
    of bytes in the pdl data, any conversion to
    a perl array implies a maximum of ~2**31
    elements (not bytes) since the array keys
    are represented internally by I32 values.

     
  • Chris Marshall
    2010-11-28

    Looking into this further: the return value of nelem
    would need to be changed to match nvals, so the
    effect could extend to every use of nelem that
    assumes an int value. The checks for this would
    be more extensive than could be verified before
    the planned 2.4.8 *minor* bugfix and polishing
    release.

     
  • Chris Marshall
    2011-03-07

    In addition to fixes in the PDL Core routines,
    modifications will also be needed in the PP
    and other threading/indexing routines that
    assume that the PDL long datatype is sufficient
    to hold an index.

    This could be an important problem as a number
    of the routines do their magic on a flattened
    piddle which is exactly when the current use of
    long for offsets would break down.
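A minimal sketch of why flattened offsets are the weak point, assuming the first dimension varies fastest and that the PDL long type is 32 bits wide. `flat_offset` and `fits_in_int32` are hypothetical helpers written for this note, not PDL routines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset of element idx[] in a flattened array with
 * dimensions dims[], first dimension varying fastest. The arithmetic is
 * done in 64 bits so the result is exact for the sizes discussed here. */
int64_t flat_offset(const int64_t *dims, const int64_t *idx, int ndims)
{
    int64_t offset = 0;
    int64_t stride = 1;
    for (int d = 0; d < ndims; d++) {
        assert(idx[d] >= 0 && idx[d] < dims[d]);
        offset += idx[d] * stride;
        stride *= dims[d];
    }
    return offset;
}

/* True when an offset could be stored in a 32-bit signed integer. */
int fits_in_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

Each dimension of a 65536 x 65536 array fits comfortably in 32 bits, but the flattened offset of its last element is 2**32 - 1, which does not.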

     
  • Chris Marshall
    2011-03-09

    It might be slightly easier to implement a large pdl mode
    by changing the pdl size types from long to unsigned long.
    We would still need to check that the PP looping code does
    not depend on negative offsets.
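The negative-offset concern can be illustrated with a short sketch. It assumes a 32-bit increment type combined with a 64-bit position, which is only one possible arrangement; `step_signed` and `step_unsigned` are hypothetical names, not PDL functions.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a 64-bit position by a signed 32-bit increment.
 * A negative increment moves the position backwards, as intended. */
int64_t step_signed(int64_t pos, int32_t inc)
{
    assert(pos >= 0);
    return pos + inc;
}

/* Advance a 64-bit position by an unsigned 32-bit increment.
 * An increment that was meant to be -1 arrives as 4294967295 and
 * moves the position forwards by that amount instead. */
int64_t step_unsigned(int64_t pos, uint32_t inc)
{
    assert(pos >= 0);
    return pos + inc;
}
```

Any loop that walks a dimension backwards would therefore need its increments to stay signed, or to be as wide as the position they are added to.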

    True 64bit pdl support can only be on platforms with 64bit
    pointers. Need to check that $Config{ptrsize} == 8...

     
  • Chris Marshall
    2011-07-25

    We also need to change dims and dimincs from PDL_long to the appropriate 64bit value.
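As a rough sketch of what 64-bit dims and dimincs would look like, assuming dimincs holds the per-dimension increments of a contiguous array with the first dimension fastest. The type name `big_index_t` and the function `set_dimincs` are hypothetical and are not taken from the PDL sources.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t big_index_t;   /* hypothetical name for a 64-bit index type */

/* Fill dimincs[] from dims[] (first dimension fastest) and return the
 * total number of elements. All arithmetic is done in big_index_t. */
big_index_t set_dimincs(const big_index_t *dims, big_index_t *dimincs, int ndims)
{
    big_index_t nvals = 1;
    for (int d = 0; d < ndims; d++) {
        assert(dims[d] >= 0);
        dimincs[d] = nvals;    /* increment for dimension d */
        nvals *= dims[d];
    }
    return nvals;
}
```

With dims of 100000 x 100000 x 2, the third increment is already 10**10, so dimincs needs the wider type even when every individual dimension is small.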

     
  • Chris Marshall
    2011-08-05

    • summary: support for piddles >2 GiB --> support for piddles >2**31 elements
     
  • Chris Marshall
    2011-08-05

    Since the number of elements in a piddle is represented
    as an int, the maximum number of elements possible is
    ~2**31. Assuming 64-bit pointers and addressing, it should
    be possible to increase the maximum size of a piddle by
    changing the index type from int/long to some int64 type.

     
  • Chris Marshall
    2011-08-07

    An additional note: supporting PDLs larger than 2GiB with fewer
    than 2**31 elements is a much easier fix than true 64bit PDL
    support, which would require changing all the index types from
    ints to int64 types.

     
  • Chris Marshall
    2012-04-03

    • priority: 5 --> 7
     