#277 array_displace_check index uintL->uintV

Milestone: lisp error
Status: closed-postponed
Owner: Bruno Haible
Labels: clisp (525)
Priority: 5
Updated: 2006-04-22
Created: 2005-10-03
Creator: Sam Steingold
Private: No

Bruno, why is the index argument to array_displace_check
uintL and not uintV?
Is this a bug?

Discussion

  • Bruno Haible
    2006-04-18

    • status: open --> closed-works-for-me
     
  • Bruno Haible
    2006-04-18

    Array dimensions are stored as an uintL.
    array-dimension-limit = 2^32-1.
    Therefore all array indices and sizes are 32-bit.
    Care is needed only when a Lisp integer is used as an
    array index,
    to avoid _implicit_ uintV -> uintL conversions before the range
    check.

    An example of how it's done, in function ROW-MAJOR-AREF:

      if (!posfixnump(STACK_0))
        fehler_index_type(array);
      var uintV indexv = posfixnum_to_V(STACK_0);
      if (indexv >= array_total_size(array)) /* index must be smaller than size */
        fehler_index_range(array,array_total_size(array));
      /* Only after indexv is known to be < size, we can safely cast it to uintL: */
      var uintL index = indexv;
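
The order of check and conversion matters because a narrowing conversion silently discards the high bits. A minimal C sketch of the difference, assuming a 64-bit uintV and a 32-bit uintL; the typedefs and helper names below are illustrative stand-ins, not CLISP's actual definitions:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t uintL;  /* assumption: 32-bit array index/size type */
typedef uint64_t uintV;  /* assumption: 64-bit fixnum value type */

/* Buggy order (hypothetical helper): the value is narrowed before the
   check, so 2^32 + 1 wraps to 1 and is wrongly accepted. */
static int in_range_narrow_first(uintV indexv, uintL size) {
  uintL index = (uintL)indexv;   /* high 32 bits are discarded here */
  return index < size;
}

/* Safe order (hypothetical helper): compare at full width, then narrow. */
static int in_range_check_first(uintV indexv, uintL size, uintL *index_out) {
  if (indexv >= size)            /* size is converted to 64 bits */
    return 0;
  assert(indexv <= UINT32_MAX);  /* follows from indexv < size */
  *index_out = (uintL)indexv;    /* lossless after the range check */
  return 1;
}
```

With indexv = 2^32 + 1 and size = 10, the first helper returns 1 because the index wrapped around to 1, while the second returns 0.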

     
  • Sam Steingold
    2006-04-18

    32 bits is far too little for the array limit.
    this is a 64-bit world now.
    there should be a way to address the whole of RAM,
    be it 16GB or more.

     
  • Sam Steingold
    2006-04-18

    • status: closed-works-for-me --> open
     
  • Jörg Höhle
    2006-04-20

    >32 bits is far too little for the array limit.
    >this is a 64-bit world now.
    This reminds me of a tradeoff found in many databases:
    Is it OK to use a 64-bit array index when only very few, if
    any, indices in an application will need so many bits?
    Why is there a fixnum type, when all integers could be
    represented using the bignum data structure?
    Databases historically solve this using distinct types.
    CLISP could do something similar, introducing a new array
    type. But that would, again, cause code bloat and likely
    more untested code.
    For the near future, I'd prefer that the posfixnum_to_V code be
    reviewed. I'm sure there are bugs left.

     
  • Bruno Haible
    2006-04-22

    • status: open --> closed-postponed
     
  • Bruno Haible
    2006-04-22

    > 32 bits is far too little for the array limit.
    > this is a 64-bit world now.

    Please mention a concrete, real use case that needs arrays
    with more than 2^32 elements in a row. As far as I'm aware,
    in Lisp it's commonplace to create data structures using many
    objects, rather than huge arrays.

    The machines I have access to have 512 MB RAM or less.
    A year ago, LispWorks was still sold with an array dimension
    limit of 2^16. This indicates that "a 64-bit world" is still a few
    years away.

    I'm closing this bug as "Postponed". Please reopen it when
    the 64-bit world has arrived.

     
  • Sam Steingold
    2006-04-24

    you are falling into the old fallacy of assuming that your
    use pattern is everyone's use pattern.

    remember a couple of years ago I needed multi-gig images and
    you had to fix that functionality in a hurry (thanks for
    helping me then!)

    as far as I am aware, lisp philosophy has been to avoid
    arbitrary limits on functionality (witness the seamless
    integration of bignums and fixnums) - and not being able to
    create huge arrays is not a good idea.

    limitations of other lisps are not a good reason to cripple
    clisp.

     
  • Jörg Höhle
    2006-04-25

    User subclassable array types would solve this issue and
    provide other valuable extensions.
    E.g. I've used the extendable SEQUENCE types to provide a
    bridge to foreign arrays (unfinished code), so POSITION,
    FIND and other sequence functions work on foreign array
    descriptors.
    It's dog slow (as all sequence functions), but could have
    its uses. Better than sequence would IMHO be integration
    with array functions (e.g. AREF on the foreign thing).

    I mention this because I can believe (for the near future)
    that if you allocate a >32-bit array, you don't want the GC
    to move it around. So malloc() is fine for it, and a
    foreign<->array bridge would be enough for many needs.

     
  • Sam Steingold
    2006-04-25

    actually, this is the use case raised on clisp mailing lists
    a few times in the past:
    people want to use the normal lisp array functions on
    a huge mmaped file.
    they also want this access to be fast, i.e.,
    without the FFI overhead.

     
  • Jörg Höhle
    2006-05-05

    1. I reviewed the bits in array.d which I thought were
    problematic. They are not. I forgot that the C compiler
    would do the right thing (DRT) with a uint64 < uint32 comparison.

    2. It's not trivial to add 64-bit AREF access to a new
    #<FOREIGN-ARRAY> type without touching the others, since
    all C functions, e.g. subscripts_to_index() and vector_length(),
    use uintL.
    Actually, the integration work for such an addition is
    independent of array-dimension-limit.

    3. About "fast". The SEQUENCE functions in CLISP are
    really generic and not fast at all anyway.
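
The uint64 < uint32 comparison mentioned above is well defined in C: under the usual arithmetic conversions the narrower unsigned operand is converted to the wider type before comparing, so nothing is truncated. A small sketch, assuming fixed-width stand-ins for uintV and uintL (the function name is hypothetical and not taken from array.d):

```c
#include <assert.h>
#include <stdint.h>

/* The result type of mixing the two widths is the 64-bit one, i.e. the
   32-bit operand is widened; the 64-bit operand is never narrowed. */
static_assert(sizeof((uint64_t)0 + (uint32_t)0) == sizeof(uint64_t),
              "uint32_t operand is converted to uint64_t");

/* Hypothetical helper: compares a 64-bit index against a 32-bit size.
   The comparison is performed in 64 bits. */
static int below_limit(uint64_t indexv, uint32_t size) {
  return indexv < size;
}
```

So an index such as 2^32 + 5 compares as larger than any 32-bit size, rather than being mistaken for 5.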

     
  • Sam Steingold
    2006-05-05

    aref is fast enough and it does not cons.
    64-bit access is indeed orthogonal to sequence functions
    speedup, but there is no reason not to do both.

    64-bit computers are so common now - and will be even more
    common soon - that any argument against doing as much as
    possible in 64 bits reminds me of Tanenbaum's criticism of
    Linus's reliance on the i386 as being elitist and out of reach of
    an average student user.