
#23 JudySL and JudyHS efficiency

open
nobody
None
5
2022-05-29
2022-05-29
No

Hi. Thank you for Judy!

I am the founder of Netdata, a popular open-source monitoring platform. We use JudyHS and JudyL in our database design for in-memory indexing and we are very happy with them.

Lately, we tried to use JudySL and JudyHS for different purposes and I have identified a couple of issues I would like to share with you.

JudySL
In JudySL, when walking through the index with JudySLFirst(), JudySLNext(), JudySLPrev() and JudySLLast(), the library copies the key of each entry into the user-supplied Index buffer. This means that just traversing an array of 1 billion entries requires 1 billion strcpy()/memcpy() calls. This significantly limits the possible uses of JudySL, making it extremely slow when traversing the whole array. It would be better to provide an ephemeral pointer to an internal string (if such a thing exists inside JudySL and the string is not decomposed).

JudyHS
In JudyHS, JudyHSDel() returns an int instead of the Value (a Word_t). If the caller uses the Value to track other allocations that also need to be cleaned up when an item is removed, the caller is left with just one option: call JudyHSGet() before each JudyHSDel() to fetch the Value of the item about to be deleted. This makes deletion the weakest point of JudyHS, as it becomes slower than any other operation.

In JudyHS again, the library does not provide traversal functions like JudyHSFirst(), JudyHSNext(), etc., because the hash table is not sorted. There are many use cases where sorting is irrelevant but traversal is very important. For these use cases JudyHS can only be used together with a doubly linked list, which significantly increases the memory footprint of the solution.

Memory Used
JudyHSFreeArray() returns the number of bytes released. This means that the library does keep track of the memory it has allocated. It would be very helpful to be able to get this information without destroying the array. Possibly a JudyXMemoryUsed()?

These are my observations. Thank you again for Judy! You really rock!
