Just Launched: You can now import projects and releases from Google Code onto SourceForge
We are excited to release new functionality to enable a 1-click import from Google Code onto the Allura platform on SourceForge. You can import tickets, wikis, source, releases, and more with a few simple steps. Read More
Just some general questions about judy and how well it works.
As I understand, array members that are uninitialized (have no value assigned,) take up no space, is this the case?
Can I define and use a custom data type in a judy array?
What is the size of a judy array compared to the size of a regular array (this is probably best expressed as a percent(of course if array members that are uninitialized take up no space the answer can be variable))?
From: Alan Silverstein <ajs@fr...> - 2013-05-24 15:54:38
Some quick answers from me (libJudy codeveloper >10 years ago), and Doug
can follow up.
> As I understand, array members that are uninitialized (have no value
> assigned,) take up no space, is this the case?
Almost but not exactly. Depending on the defined indexes/keys, internal
data storage changes shape trying to achieve the best space
"compression" (but doing it quickly and very locally compared with
by-population tree models). In Judy1 or JudyL there can be "vacant"
slots for unused elements, when it costs less than working hard to
switch data formats to avoid them.
However to a first order of approximation you can think of it the way
you said it. In particular large unused expanses (such as "all indexes
between 12,047 and 115,399") are very cheap, sometimes zero cost.
> Can I define and use a custom data type in a judy array?
Not within, no. All JudyL or JudySL gives you is a one-word "value
area" that you define any way you like -- such as a pointer to a custom
data object that you malloc()/free() yourself.
> What is the size of a judy array compared to the size of a regular
> array (this is probably best expressed as a percent(of course if array
> members that are uninitialized take up no space the answer can be
Yeah. Sparse arrays in general (of all kinds) are widely used because
they waste little space on empty slots. The bytes-per-index metric is a
good way to study it, and Judy does very very well compared to other
models, even in corner cases. The percentage of which you speak is highly
variable. In the worst case any sparse array, even libJudy, can be >100%
due to overhead when the array is (nearly) full. But mostly with a very
sparse array, and a large expanse (like 2^32 or 2^64), the ratio is tiny
comparing the data structure to a flat array.