On Mon, Dec 20, 2010 at 2:02 PM, john skaller <skaller@users.sourceforge.net> wrote:

On 21/12/2010, at 8:17 AM, Jim Lloyd wrote:

> I agree that malloc/free has excellent performance and that it was a good choice to have Judy use malloc as opposed to some other memory allocator using a single heap. It is only because we want multiple pools that we want something other than malloc.

Is your app multi-threaded? If not, you can use a global variable to store the
allocator data: very low cost compared to passing the allocator around.
[You set the variable before each operation]

Yes, our app is multi-threaded . Actually, we have several different applications that share common code and Judy is integral to all of these applications. Some of these apps are batch and some are long-lived daemons. All of them scale well up to 16 cores.

The same method would work for multi-threaded apps with a lock to protect
the allocator: pretty ugly and much higher performance cost.

No, that's not feasible. We can have literally millions of Judy arrays in one process. One global lock shared by all Judy arrays would kill the performance.

> We're prepared to sacrifice a bit of performance due to the increase in complexity. Even if the underlying pool allocator is as good as malloc/free, there will be a small performance cost because a reference to the allocator will have to be stored in the root record for a Judy array and then passed down as a function argument in the various operations that traverse the internal tree/trie data structures. We believe the benefits will outweigh the cost.

For you. The challenge would be to modify the source so this overhead is optional.
Say, by having two completely distinct versions of the source.. (Arggg not another
MACRO please .. :) The current build technology is very bad so lets not make
that even worse with more generators: we had to throw that stuff out because
it is too hard to manage.

I agree this is a challenge.

My primary application for Judy is in a garbage collector, which maps pointers
to information about the allocated objects. Judy is also available "in language" as
a data structure to hold integers (can't use pointers yet because it requires the gc
know all about Judy, since Judy hides addresses).

In this context, a small extra cost adding a new address to the Judy data structures
would be acceptable: allocation is already expensive, so an extra 5-10% overhead
would be OK.

That's my assumption. I'd expect the added overhead to be under 1% when amortized over the total cost (i.e. including the cost of the allocations). There should be no extra cost to the Judy*Get operations since these don't ever allocate. A Judy*Insert operation will be marginally more expensive, even in the case where the key already exists and no allocation is necessary, due to an additional argument on each stack frame. This is the worst case for cost, because the other case of a Judy*Insert where the key doesn't exist will have cost dominated by the cost of the memory allocation.

The cost on a collection is another story though, i.e. the cost of *removing* values
from a Judy array matters here because all the removals are done "at once" during
the reaping phase of the collector and so are observable by the programmer
(think games and framerates).

Do you need to de-allocate as well? Often pools are used to avoid deallocation
costs: you just de-allocate the whole pool.

Yes, de-allocation is one of the big motivating factors for us, primarily in our long-lived daemon applications.

john skaller