From: Toni C. <tca...@to...> - 2007-05-08 09:08:23
What is interesting is the assertion that sets/maps outperform Judy if the
entire structure fits in the (presumably L2) cache, but that this performance
relationship is reversed once it doesn't fit. That seems a bit odd, but then
again this is not about comparing the underlying Judy1 concept in isolation,
but rather about hash tables, so there must be other factors coming into play
here?

Pentium 4 and later have a cache line of 64 bytes for L1 and 128 bytes for
L2. This second value is also the minimum system bus transfer size, I think,
so for two threads, read/write areas should be spaced at least 128 bytes
apart (and 64-byte aligned).

As an aside, I wonder if any of the SSE instructions could help with the bit
processing required by Judy?

Regards,
Toni.

----
Toni Cassisi
Tovica Ltd
http://www.tovica.com
Tel: +44 (0) 7971 874 054
IM: AOL/Yahoo/MSN: tcassisi

> -----Original Message-----
> From: jud...@li... [mailto:judy-devel-bo...@li...] On Behalf Of zo...@zo...
> Sent: 07 May 2007 19:04
> To: jud...@li...
> Subject: benchmark vs. benchmark
>
> Dear Folks:
>
> Thanks for the Judy Tree data structure! It has been an excellent tool to
> use in some computational linguistics research where we need to fiddle with
> large datasets of small objects (words, pairs of words, triples of words,
> frequencies, etc).
>
> Have you seen this?
>
> http://citeseer.ist.psu.edu/fritchie03study.html
>
> Judy comes out very nicely in some benchmarks.
>
> Contrasted with the one that has already been discussed here:
>
> http://www.nothings.org/computer/judy/
>
> Where Judy barely holds her own against Sean Barrett's basic hash table.
>
> My guess is that Judy does so much better in the fritchie03study than in the
> Sean Barrett (nothings.org) because fritchie was using a CPU with a 64-byte
> cache line (for which Judy was designed), and Barrett was using a
> Pentium III which I believe has a 32-byte cache line.
>
> This suggests that Judy is better performing on the common machines today
> than it was on the common machines a few years ago.
>
> My experience was on an Athlon-64 (details below), and as I said it was very
> pleasant. The same source code also ran on lesser, 32-bit machines, but it
> didn't hum along as nicely.
>
> Regards,
>
> Zooko
>
> MAIN yumyum:~$ cat /proc/cpuinfo
> vendor_id : AuthenticAMD
> cpu family : 15
> model : 4
> model name : AMD Athlon(tm) 64 Processor 3700+
> stepping : 10
> cpu MHz : 2403.176
> cache size : 1024 KB
> fpu : yes
> fpu_exception : yes
> cpuid level : 1
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow up
> bogomips : 4809.50
> TLB size : 1024 4K pages
> clflush size : 64
> cache_alignment : 64
> address sizes : 40 bits physical, 48 bits virtual
> power management: ts fid vid ttp
>
> _______________________________________________
> Judy-devel mailing list
> Jud...@li...
> https://lists.sourceforge.net/lists/listinfo/judy-devel
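
On the spacing point in the reply (per-thread read/write areas at least 128 bytes apart), a minimal C11 sketch of one way to lay that out follows. The 128-byte figure is taken from the message, where it is stated tentatively, and other CPUs may want a different value. The names SLOT_SPACING, thread_slot, slots, slot_distance and bump are hypothetical and exist only for this illustration; nothing here comes from the Judy sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128 bytes is the spacing suggested in the message for
   Pentium 4-class parts; adjust for other CPUs. */
#define SLOT_SPACING 128

/* Hypothetical per-thread slot: a single counter, aligned so that each
   slot starts on its own 128-byte boundary. */
typedef struct {
    _Alignas(SLOT_SPACING) uint64_t counter;
} thread_slot;

/* A struct's size is rounded up to a multiple of its alignment, so
   consecutive array elements sit exactly SLOT_SPACING bytes apart. */
static_assert(sizeof(thread_slot) == SLOT_SPACING,
              "each slot should fill exactly one spacing unit");

#define NUM_THREADS 2
static thread_slot slots[NUM_THREADS];

/* Byte distance from slot a to slot b (positive when b > a). */
static ptrdiff_t slot_distance(int a, int b) {
    return (char *)&slots[b] - (char *)&slots[a];
}

/* Each thread would only ever write to its own slot. */
static void bump(int thread_id) {
    slots[thread_id].counter++;
}
```

With this layout, thread 0 would call bump(0) and thread 1 would call bump(1), and the two writes land in separate 128-byte regions. Aligning to 128 also satisfies the 64-byte alignment mentioned in the message, at the cost of 120 bytes of padding per slot.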