Menu

#4663 Bucket table for undeclared array keeps growing beyond 32768 buckets

None
closed
nobody
5
2026-04-19
2026-01-25
No

The implementation of undeclared arrays makes use of a bucket table. There is supposed to be one bucket per possible hash code, but HASHER returns a value modulo 32768, so there are only that many distinct values. However, the bucket table is reallocated as the number of items grows, and keeps growing past 32768 buckets. Any buckets past that number cannot be used and just remain empty.

You can see the empty buckets by something like

kill (foo);
for i thru 2^20 do foo[i]: i;

and then

to_lisp ();
(defvar a (get (mget '$foo 'hashar) 'array))
(print (aref a 0))
(dotimes (i 1048576) (print (length (aref a (+ 3 i)))))

prints nonzero values for a while (presumably 32768 of them, I didn't count) and then just 0 thereafter.

Probably just need to forgo the reallocation when the bucket table is already at the maximum possible size. Might also want to bump up the maximum value of the hash code; I see that HASHER masks the return value with 32767 so maybe all that's needed is to choose a bigger mask, although there may be wrinkles that need to be ironed out.

Discussion

  • David Scherfgen

    David Scherfgen - 2026-04-19
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -2,8 +2,8 @@
    
     You can see the empty buckets by something like
     ```
    -kill (aa);
    -for i thru 2^20 do aa[i]: i;
    +kill (foo);
    +for i thru 2^20 do foo[i]: i;
     ```
     and then
     ```
    
    • status: open --> closed
     
  • David Scherfgen

    David Scherfgen - 2026-04-19

    Good catch, that's just a waste of memory!
    I changed aa to foo to make the example work.

    Note that there's a comment in HASHER that says:

    please don't change this code or you're liable to break SAVE files.

    Therefore, for now, I only extracted the 32768 into a new constant +hasher-mod+ and updated the resizing logic to never resize beyond this; see commit [cd4f49].

     

    Related

    Commit: [cd4f49]


Log in to post a comment.

MongoDB Logo MongoDB