Thread: found JudyNL sources!

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

Turns out I kept around, but forgot, two different libJudy source trees
for myself; one the early official SourceForge one, and another a
slightly earlier initial snapshot.  (Didn't have to go to a CDROM at
all.)  In the latter I found these files of interest (per the HP
internal source tree layout):

    doc/ext/JudyNL_3x.htm
    doc/int/JudyNL_issues
    src/JudyNL/README
    src/JudyNL/JudyNL.h
    src/JudyNL/JudyNL.c
    test/manual/JudyNLCheck.c

So I built a little tarchive, just 71 Kb, no point in even compressing
it.  If anyone wants a copy, email to me only and I'll send it back to
you as an attachment.  But first note the text appended below, which is
the majority of the issues file (from Dec 2001) mentioned above.

Bear in mind that JudyNL is "just" an array-of-array... of JudyL, where
the top level keys are key lengths, and each successive subarray carries
four successive bytes of the keys.

Cheers,
Alan Silverstein

------------

...Doug was not happy about several things, if I recall right:

* JudyNL sorts indexes by length first rather than lexicographically
  (left to right).  I have a paper written that discusses how we might
  support variable-size keys with lexicographic sorting (available on
  request), but have not implemented it, and it's painful to look at the
  data structs required.  Dunno how efficiently it would operate either.

* JudyNL insists that index lengths be whole words, though for HPUX
  libc compatibility the length is expressed in bytes (which must be a
  multiple of the word size).  Now, you could modify the JudyNL code to
  express length however you like (bits, bytes, words, nibbles...), the
  only catch (additional work not now performed) is needing to null-pad
  indexes that are not whole words to ensure single unique insertion in
  the meta-trie, since the underlying trie nodes = JudyL arrays operate
  on words.  JudySL already includes this type of padding, but JudyNL
  has no need for it.

* JudyNL is possibly overkill for a given application because it allows
  a variety of index sizes in a single array, at a cost in complexity
  and overhead, when perhaps any given user or application might want
  only one or a set of single-size index arrays (fixed-depth trees),
  like Judy2L.

* Unlike JudySL, JudyNL has no provision internally for "shortcut
  leaves" at this time, meaning a unique suffix still descends M (< N)
  additional levels of meta-trie, equivalent to single-child branch
  nodes in a simple trie.  Not too big a deal if N is relatively small,
  whereas JudySL strings are more likely (?) to be quite long and have
  long unique suffixes.

* Due to not shipping it, JudyNL code is now out of date with respect to
  error handling and also has no automated regression tests.

* And ultimately, no user had requested this feature, there was no
  "pull", unlike JudySL.  We created JudyNL on my initiative, after I
  recognized the difference between length-terminated and
  length-associated variable-size indexes, I decided to implement both
  (and internally they are quite different).

Thread: found JudyNL sources!

judy-devel