From: Alan S. <aj...@fr...> - 2014-02-24 05:25:06
|
Turns out I kept around, but forgot, two different libJudy source trees for myself; one the early official SourceForge one, and another a slightly earlier initial snapshot. (Didn't have to go to a CDROM at all.) In the latter I found these files of interest (per the HP internal source tree layout): doc/ext/JudyNL_3x.htm doc/int/JudyNL_issues src/JudyNL/README src/JudyNL/JudyNL.h src/JudyNL/JudyNL.c test/manual/JudyNLCheck.c So I built a little tarchive, just 71 Kb, no point in even compressing it. If anyone wants a copy, email to me only and I'll send it back to you as an attachment. But first note the text appended below, which is the majority of the issues file (from Dec 2001) mentioned above. Bear in mind that JudyNL is "just" an array-of-array... of JudyL, where the top level keys are key lengths, and each successive subarray carries four successive bytes of the keys. Cheers, Alan Silverstein ------------ ...Doug was not happy about several things, if I recall right: * JudyNL sorts indexes by length first rather than lexicographically (left to right). I have a paper written that discusses how we might support variable-size keys with lexicographic sorting (available on request), but have not implemented it, and it's painful to look at the data structs required. Dunno how efficiently it would operate either. * JudyNL insists that index lengths be whole words, though for HPUX libc compatibility the length is expressed in bytes (which must be a multiple of the word size). Now, you could modify the JudyNL code to express length however you like (bits, bytes, words, nibbles...), the only catch (additional work not now performed) is needing to null-pad indexes that are not whole words to ensure single unique insertion in the meta-trie, since the underlying trie nodes = JudyL arrays operate on words. JudySL already includes this type of padding, but JudyNL has no need for it. * JudyNL is possibly overkill for a given application because it allows a variety of index sizes in a single array, at a cost in complexity and overhead, when perhaps any given user or application might want only one or a set of single-size index arrays (fixed-depth trees), like Judy2L. * Unlike JudySL, JudyNL has no provision internally for "shortcut leaves" at this time, meaning a unique suffix still descends M (< N) additional levels of meta-trie, equivalent to single-child branch nodes in a simple trie. Not too big a deal if N is relatively small, whereas JudySL strings are more likely (?) to be quite long and have long unique suffixes. * Due to not shipping it, JudyNL code is now out of date with respect to error handling and also has no automated regression tests. * And ultimately, no user had requested this feature, there was no "pull", unlike JudySL. We created JudyNL on my initiative, after I recognized the difference between length-terminated and length-associated variable-size indexes, I decided to implement both (and internally they are quite different). |