#30 support for piddles >2**31 elements

Chris Marshall

Now that 64bit OSes and machines with more than 4GiB of
memory are common, it is apparent that the current PDL
size is limited by the implementation to 2GiB
(pdl_malloc, and hence SvGROW, is called with the size
as an int instead of STRLEN).

The allocation limit fix should be straightforward. The
difficulty will be verifying that there are no hidden
assumptions about the size of piddles elsewhere in the
code (e.g., in PDL::PP).
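
To illustrate why an int byte count caps the allocation near 2GiB,
here is a minimal C sketch. The helper names fits_in_int and
byte_count are hypothetical and are not PDL functions; the sketch
only assumes that the byte count is nvals times the element size.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of PDL): returns 1 if the byte
 * count nvals * elem_size can be represented in a signed int. */
int fits_in_int(size_t nvals, size_t elem_size)
{
    assert(elem_size > 0);
    return nvals <= (size_t)INT_MAX / elem_size;
}

/* Hypothetical helper: computes the byte count in size_t, which is
 * the kind of width STRLEN provides. Returns 0 on success, or -1
 * if even size_t would overflow. */
int byte_count(size_t nvals, size_t elem_size, size_t *out)
{
    assert(elem_size > 0);
    if (nvals > SIZE_MAX / elem_size)
        return -1;
    *out = nvals * elem_size;
    return 0;
}
```

With 300 million doubles (8 bytes each) the byte count is about
2.4e9, which no longer fits in a 32-bit int but is still well
within the range of size_t.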

Thanks to Albrecht Schmid for reporting the issue.


  • Chris Marshall

    There are a bunch of places in the PDL core routines
    that assume nvals is an int. Those will need to be
    modified. Any routine accessing the internal pdl
    structure will need to be modified if it uses nvals.

    This is too much to do and verify before the PDL-2.4.7
    release but we will plan to address the problem after
    that release occurs.

  • Chris Marshall

    In addition to the 2GiB limit on the number
    of bytes in the pdl data, any conversion to
    a perl array implies a maximum of ~2**31
    elements (not bytes) since the array keys
    are represented internally by I32 values.

  • Chris Marshall

    Looking into this further: the return value of nelem
    would need to change to match nvals, so the change
    could affect every use of nelem that assumes an int
    value. Checking for this would be more extensive than
    could be verified before the planned 2.4.8 *minor*
    bugfix and polishing release.

  • Chris Marshall

    In addition to fixes in the PDL Core routines,
    modifications will also be needed in the PP
    and other threading/indexing routines that
    assume that the PDL long datatype is sufficient
    to hold an index.

    This could be an important problem as a number
    of the routines do their magic on a flattened
    piddle which is exactly when the current use of
    long for offsets would break down.

  • Chris Marshall

    It might be slightly easier to implement a large pdl mode
    by changing the pdl size types from long to unsigned long.
    We would still need to check that the PP looping code does
    not depend on negative offsets.

    True 64bit pdl support is only possible on platforms with 64bit
    pointers. Need to check that $Config{ptrsize} == 8...
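
    A rough C sketch of the two checks mentioned above. All three
    function names are hypothetical and not taken from the PDL
    sources. The first is a C-side analogue of the ptrsize test;
    the other two show why a loop that relies on negative offsets
    would misbehave if the index type became unsigned long.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: 1 if this build has 64-bit pointers,
 * comparable to checking $Config{ptrsize} == 8 from Perl. */
int has_64bit_pointers(void)
{
    return sizeof(void *) * CHAR_BIT == 64;
}

/* Signed index: stepping back past zero yields a negative value,
 * which code can detect with a "< 0" test. */
int signed_step_goes_negative(long base, long step)
{
    assert(base >= 0 && step >= 0);
    return base - step < 0;
}

/* Unsigned index: the same step wraps around to a huge positive
 * value, so a "< 0" test can never fire. Wraparound shows up as
 * a result larger than the starting point. */
int unsigned_step_wraps(unsigned long base, unsigned long step)
{
    return base - step > base;
}
```

    Any PP loop that counts down past zero, or stores a negative
    increment, would need auditing before switching to an unsigned
    type.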

  • Chris Marshall

    We also need to change dims and dimincs from PDL_long to the appropriate 64bit type.
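
    As a sketch of what 64bit dims and dimincs buy us, the C code
    below computes an element count and a flattened offset with a
    64-bit index type. The typedef idx_t and both function names
    are hypothetical, chosen only for illustration; they are not
    the types or routines PDL would actually use.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t idx_t;  /* hypothetical 64-bit index type */

/* Total number of elements: the product of all dimension sizes. */
idx_t total_elems(const idx_t *dims, int ndims)
{
    idx_t n = 1;
    for (int i = 0; i < ndims; i++) {
        assert(dims[i] >= 0);
        n *= dims[i];
    }
    return n;
}

/* Offset into the flattened data: sum of position times the
 * per-dimension increment. */
idx_t flat_offset(const idx_t *pos, const idx_t *dimincs, int ndims)
{
    idx_t off = 0;
    for (int i = 0; i < ndims; i++)
        off += pos[i] * dimincs[i];
    return off;
}
```

    For a 50000 x 50000 piddle the element count is 2.5e9 and the
    offset of the last element is 2499999999. Both exceed what a
    32-bit signed value can hold, even though each individual
    dimension is small.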

  • Chris Marshall

    • summary: support for piddles >2 GiB --> support for piddles >2**31 elements
  • Chris Marshall

    Since the number of elements in a piddle is represented
    as an int, the maximum number of elements possible is
    ~2**31. Assuming 64-bit pointers and addressing, it should
    be possible to increase the maximum size of a piddle by
    changing the index type from int/long to some int64 type.

  • Chris Marshall

    An additional note: supporting PDLs larger than 2GiB with fewer
    than 2**31 elements is a much easier fix than the true 64bit
    PDL which would require changing all the index types from ints
    to int64 types.

  • Chris Marshall

    • priority: 5 --> 7