> I'd like to know if that's because you want basic sorting of those...
yes, lexicograhically sorting is needed. This is necessary if
for multi-field keys in databases (the null character can occurs within the key).

The sorted by length like in JudyHS function cannot be used.
The big advantage of the judy arrays is the lexicograhically ordering.
Otherwise you can simply use hashing, witch is faster. You must always take this into
consideration. For example the hat-trie (see askitis.com) is faster than the judy-arrays,
but no lexical ordering is supported.

> Storing variable-length, length-associated keys sorted lexicographically
is actually a fairly hard problem.
I don't know the internals of the judy arrays but i think, you can store the common prefixes as actually done and
store the suffixes as variable length array (length as varbyte + the byte array).



Von: Alan Silverstein <ajs@frii.com>
An: schaeferhaus@yahoo.de
CC: judy-devel@lists.sourceforge.net
Gesendet: 22:11 Dienstag, 20.November 2012
Betreff: Re: Prefix-only search?

> For me the 2 major improvements to judy arrays are:  1 - Using Judy
> arrays for non null terminated strings (this is currently a big
> drawback)...

I'd like to know if that's because you want basic sorting of those
strings as a side effect, or something else.  Basic sorting =
lexicographically (no knowledge of multibyte chars) versus
sort-by-length.

As I've mentioned before, I call this the length-terminated (like a C
string) versus length-associated (like a Perl string) problem.  In
general, any variable-sized key/index (chars, bytes, bits, whatever) is
length-associated, that is, no magic termination value inside the
string.  However, we wrote JudySL for C strings only, where NUL
termination exists, hence you get lexicographical sorting for free.

You can always save length-associated keys/indexes/strings efficiently
by making the first level a JudyL array by length (got that?), but this
array-of-array sorts/groups keys/strings by length first.  "Too bad," if
you want lexicographical sorting, use a different algorithm?

Storing variable-length, length-associated keys sorted lexicographically
is actually a fairly hard problem.  You give up some speed and space by
inserted magic "termination information" between the levels of the tree.
I think I wrote a paper about this that might be hiding somewhere on
SourceForge.

Cheers,
Alan Silverstein

------------------------------------------------------------------------------
Monitor your physical, virtual and cloud infrastructure from a single
web console. Get in-depth insight into apps, servers, databases, vmware,
SAP, cloud infrastructure, etc. Download 30-day Free Trial.
Pricing starts from $795 for 25 servers or applications!
http://p.sf.net/sfu/zoho_dev2dev_nov
_______________________________________________
Judy-devel mailing list
Judy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/judy-devel