On 11/16/2012 6:47 PM, john skaller
In the first question, I was asking if anyone can think of a more
efficient way to do a prefix search using the current Judy
functions--not asking for something more efficient than the
On 17/11/2012, at 10:43 AM, Tim Margheim wrote:
One of the most frequent operations I need to perform is a prefix
search, i.e. "return the first element that starts with Index, or return
null if nothing does".
I'm currently doing this by calling JudySLNext(). If it returns a
non-null pointer, but the new Index doesn't start with the prefix, I
still return null.
1.) Does anyone know of a more efficient way to do that with the current
set of Judy functions?
The searching is close to optimal. The only cost is copying the unwanted
key, i.e. descending into the Judy tree copying characters as we go,
when we don't actually want that key.
It's hard to see how to improve on "JudySLNext() followed by an
strncmp() on the returned key", but I thought it couldn't hurt to
Hmm, I might have to look into that--my current application isn't
affected much by the insertion performance, but it could matter for
another application that I have in mind. And I could make do with
However subsequent insertion is expensive since the search
must be repeated, even when the position for a new key is known.
Someone made a "cursor" based version of Judy that fixed this,
however that code is not re-entrant and therefore unacceptable
to me at least, because their modified Judy only supported
However, if 0 is an invalid value for someone's JudyTree, then they
don't need to do a search-then-insert-if-it-failed. You can replace
the initial find with an insertion. If the pointed-to-value
returned by the insertion is 0, then you know the key wasn't already
in the tree. If the pointed-to-value is anything other than 0, then
you know the key was already in the tree. So for those
kinds of purposes, you don't need a cursor.
In fact, even if 0 is a valid value, you could get around it
if your application would allow you to store "value+1" instead of
"value". The only cost is an addition/subtraction on storage &
retrieval of the value, plus the slight reduction in the range of
values you can store.
Also, if you modified the Judy insert function to accept a pointer
to a bool, then the bool could return whether or not the insertion
created a new element or returned an already-existing one.
2.) Doug, have you thought about making a search function that will give
up as soon as it determines that there are no matches beginning with
Index? What do you think about including it in one of the upcoming
A more consistent function would be
Good point. And it would be simple enough to write an inline
prefix-search function that auto-generates the upper-bound string
and wraps a call to find-between. (Though if you have to call the
wrapper function multiple times, it would involve multiple
allocations of the upper-bound buffer, so you might want to use the
find-between function directly in that case. Or make a version of
the wrapper where the upper-bound buffer is passed in and then
populated by the wrapper.)
On the other hand, you'd also want to carefully compare the
performance of a find-between with the performance of a prefix
search. It might be that the find-between requires some additional
instructions. Still, find-between would certainly perform better
than running Next and then checking if the returned key is outside