Hi Simen:

First, if I understand your email, I think you could solve the problem a little

differently. Do J1N() on both arrays at the same time and advance the one

that returns the lessor Index. All the data you need can be extracted when

the J1N's return different Indexes. This will only require only one access

on each data element in both arrays. For arrays of very large populations,

J1N's are very fast because typically most of the data is in the cache from

the last access (J1N).

Second, I do not understand your statement:

> However, it seems to scale not so well when the largest indicex

present grows.

If in your example that the number "20" were changed to 1234567890,

I think the performance would be about the same because Judy1

does not store 0's (missing Indexes) in the array.

I can answer more of your questions if you like. I am surprised that Judy1

is used for this "small" of problem. I am use to thinking in 100's of million

of Indexes.

Judy1 will store any combination of Indexes, in any order up to the 4 billion

max while using less than 1GB of RAM.

Doug

Doug Baskins <dougbaskins@yahoo.com>

----- Original Message ----

From: Simen Kvaal <simen.kvaal@gmail.com>

To: judy-devel@lists.sourceforge.net

Sent: Friday, May 16, 2008 9:59:17 PM

Subject: Judy1

Hi!

I came across Judy1 yesterday and have incorporated it into my code,

and it works wonderfully. My code heavily relies on a fast and

memory-efficient way to represent a sparse pattern of bits -- hence,

Judy1!

To be more specific, my code repeatedly compares two variable patterns

(Judy1 arrays) A and B, both containing a constant number of set bits,

to find differing bit positions. For example, if

A = (1, 4, 5, 10, 15)

B = (1, 5, 6, 10, 20)

Then i need the output d(A,B) = (4, 15), d(B,A) = (6, 20) and also

j(A,B) = (2, 5) and j(B,A) = (3,5), where j(A,B) are the number in the

sequence of the differing bits in A compared to B.

What would be the most efficient way to do this? Would it depend on

the density of the array? Typically, I have ~1000 indices and ~5 bits

set.

My present code traverses first A (using J1F and J1N on A) and using

J1T on B for each result. The result is then stored. Then, it repeats

the process on B.

This is fast; about as fast as using bit-wise manipulations on

unsigned integers, which was my old representation of the patterns.

However, it seems to scale not so well when the largest indicex

present grows.

Thanks in advance for any advice!

Regards,

Simen Kvaal.

--

----- Simen Kvaal -- Ph.D student in Physics/Applied Maths -----

Centre of Mathematics for Applications, University of Oslo

web: http://folk.uio.no/simenkva/ office: +47 22857708

-------------------------------------------------------------------------

This SF.net email is sponsored by: Microsoft

Defy all challenges. Microsoft(R) Visual Studio 2008.

http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

_______________________________________________

Judy-devel mailing list

Judy-devel@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/judy-devel

From: Simen Kvaal <simen.kvaal@gmail.com>

To: judy-devel@lists.sourceforge.net

Sent: Friday, May 16, 2008 9:59:17 PM

Subject: Judy1

Hi!

I came across Judy1 yesterday and have incorporated it into my code,

and it works wonderfully. My code heavily relies on a fast and

memory-efficient way to represent a sparse pattern of bits -- hence,

Judy1!

To be more specific, my code repeatedly compares two variable patterns

(Judy1 arrays) A and B, both containing a constant number of set bits,

to find differing bit positions. For example, if

A = (1, 4, 5, 10, 15)

B = (1, 5, 6, 10, 20)

Then i need the output d(A,B) = (4, 15), d(B,A) = (6, 20) and also

j(A,B) = (2, 5) and j(B,A) = (3,5), where j(A,B) are the number in the

sequence of the differing bits in A compared to B.

What would be the most efficient way to do this? Would it depend on

the density of the array? Typically, I have ~1000 indices and ~5 bits

set.

My present code traverses first A (using J1F and J1N on A) and using

J1T on B for each result. The result is then stored. Then, it repeats

the process on B.

This is fast; about as fast as using bit-wise manipulations on

unsigned integers, which was my old representation of the patterns.

However, it seems to scale not so well when the largest indicex

present grows.

Thanks in advance for any advice!

Regards,

Simen Kvaal.

--

----- Simen Kvaal -- Ph.D student in Physics/Applied Maths -----

Centre of Mathematics for Applications, University of Oslo

web: http://folk.uio.no/simenkva/ office: +47 22857708

-------------------------------------------------------------------------

This SF.net email is sponsored by: Microsoft

Defy all challenges. Microsoft(R) Visual Studio 2008.

http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/

_______________________________________________

Judy-devel mailing list

Judy-devel@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/judy-devel