Re: [Datadraw-user] Red-black trees beat treaps
Brought to you by:
smilindog2000
|
From: <dat...@li...> - 2008-05-26 16:50:28
|
Hi, Kai.
I inspected the assembly-code carefully, and found several issues.
First, the reason the #define version was so much faster is that it
didn't cast the return value to int. Since my keys were unsigned ints,
the right key minus left key was always positive, which led to an
ordered tree rather than random. When I fixed it, and improved some
cache efficiency issues, DataDraw now beats STDCXX at 29 seconds
compared to 30, and beats STL by two seconds. The DataDraw version uses
about 60% as much memory.
For ordered traversals, however, DataDraw did quite a bit better. At 10
traversals of 5 million nodes, DataDraw came in at 6.35 seconds vs 10.35
for STL. The reason for this is cache efficiency. I organized the data
to group only left child, right child, and the key as a single
structure, so every read of a node's key would load a cache line that
also contained the node's left and right child. In addition, it fills
the cache line with only left, right, and key values of nearby nodes,
causing in-order traversal to really speed up. The STL version loads
mostly data unrelated to the loop into cache, and all the pointers are
2X as big, making cache misses much more common.
One thing really hit home: speed is now all about cache performance.
The DataDraw inner loop is 9 assembly instructions due to poor
optimization from the GNU C compiler, while the STL version is only 6.
However the DataDraw version wins big due to better cache organization.
I don't know if inline functions in GNU C are as good as #defines. The
last time I checked, about 10 years ago, they slowed down my code by
10-20%. It's probably time to check again, now that all C compilers I
care about support inline.
Since cache organization is soooo important, I'm adding cache hints to
the .dd format. For this example, I think it will look like:
class Node
Node parent
Node left
Node right
uint32 key
bool color
cache_together left right key
This will cause a these three fields to be defined in a struct of their
own, and there will be an array of these structures rather than an array
for each separate field. It should be a nice hint to introduce in
algorithms after profiling.
Best regards,
Bill
On Fri, 2008-05-23 at 18:11 +0200, dat...@li...
wrote:
> Hi there :)
>
> it would be interesting to inspect the assembly-code of #define and the static code. I guess the #define really gets inlined and the static is an extra function call.... inspecting code is rather easy in visual studio, but I know you're working in a unix-environment.
>
> Optimising code is always a great challenge. But nowadays optimising memory-layout is much better than optimising only functions (ok, ok, using the right algorithm for the problem is the best ;)
>
> so congratulations for you fast routines! I'm really impressed.
> kai
>
> ----- original Nachricht --------
>
> Betreff: Re: [Datadraw-user] Red-black trees beat treaps
> Gesendet: Do, 22. Mai 2008
> Von: dat...@li...
>
> > Hi, Kai.
> >
> > For some reason, every compiler I've tested has trouble optimizing when
> > using inline functions instead of #defines. That's why #defines are
> > still in use in datadraw, though for debugging, we undefine most of them
> > and provide real functions instead.
> >
> > I tested multiset from stdcxx. You'll be happy to know it beats the STL
> > library. For 5 million nodes, my random insert/remove benchmark takes
> > 30 seconds, while STL takes 31. However, DataDraw does it in 8, while
> > doing more work (it inserts real objects with key values, rather than
> > just ints), and uses only about 60% as much memory on my 64-bit machine.
> > I've attached the files I used to do the benchmark. The DataDraw
> > benchmarks are signed in under trunk/examples/redblack_benchmarks
> >
> > I just love winning speed benchmarks :-)
> >
> > Best regards,
> > Bill
> >
> > On Thu, 2008-05-22 at 13:42 -0400, dat...@li...
> > wrote:
> > > I am surprise about the static versus #define because I though static
> > > could be interpreted as inline by the optimizer.
> > >
> > > Great to know.
> > >
> > > Regards
> > >
> > > On Thu, May 22, 2008 at 12:38 PM,
> > > <dat...@li...> wrote:
> > > Hi, Kai.
> > >
> > > Thanks for the link. I've downloaded it and hope to compare
> > > to stdcxx
> > > soon. In the meantime, I compared DataDraw ordered_lists to
> > > the g++ STL
> > > multiset. With just integers, the STL code takes 33 seconds
> > > for 5
> > > million creates, 5 million random insert delete pairs, and 5
> > > million
> > > deletes. We use to run in 45 seconds... I found out why we
> > > were slower,
> > > and fixed it. DataDraw now runs in 8 seconds!
> > >
> > > Best regards,
> > > Bill
> > >
> > > P.S. The reason the datadraw implementation was slow is a bit
> > > scary.
> > > Static inline functions DO NOT run as fast as #defines! I
> > > converted the
> > > compare function to #define base, and the speed when from 43
> > > seconds to
> > > 8!
> > >
> > > On Thu, 2008-05-22 at 11:59 +0200,
> > > dat...@li...
> > >
> > > wrote:
> > > > Hi Guys,
> > > >
> > > > I've heard that the stdcxx-code should be really fast, but
> > > I've not measured the performance against the
> > > wikipedia-entry.... what I've done is to measure the
> > > performance of the stl-, crisscross-implementation and some
> > > other and found that the stdcxx was the fastest for me. you
> > > can find the code here: http://stdcxx.apache.org/
> > > >
> > > > regards and happy coding :)
> > > > kai
> > > >
> > > > ----- original Nachricht --------
> > > >
> > > > Betreff: Re: [Datadraw-user] Red-black trees beat treaps
> > > > Gesendet: Mi, 21. Mai 2008
> > > > Von: dat...@li...
> > > >
> > > > > It's done. We now have red-black tree implementations for
> > > ordered_list
> > > > > relationships. This is the fastest code I've found on the
> > > Internet for
> > > > > ordered lists after both Richard and I did a fairly
> > > lengthy search.
> > > > > Thanks for the help, Richard!
> > > > >
> > > > > Bill
> > > > >
> > > > > On Wed, 2008-05-21 at 14:43 -0400,
> > > dat...@li...
> > > > > wrote:
> > > > > > ?Hi.
> > > > > >
> > > > > > I hand-coded both treap and black-red tree benchmarks.
> > > The red-black
> > > > > > code I used is the code from the wikipedia.org page.
> > > For random insert
> > > > > > and delete of 10 million nodes, treaps took 63 seconds,
> > > while red-black
> > > > > > took 43 seconds. Treaps actually trounced red-black
> > > trees when used as
> > > > > > a fifo (deleting only smallest, inserting only largest),
> > > but I think
> > > > > > random is the more important case for ordered lists.
> > > > > >
> > > > > > I'm going to convert over to wikipedia style red-black
> > > trees if there
> > > > > > are no objections. You can find my benchmarks in
> > > > > > examples/treap_benchark and examples/redblack_benchmark.
> > > > > >
> > > > > > Regards,
> > > > > > Bill
> > > > > >
> > > > > >
> > > > > >
> > >
> > -------------------------------------------------------------------------
> > > > > > This SF.net email is sponsored by: Microsoft
> > > > > > Defy all challenges. Microsoft(R) Visual Studio 2008.
> > > > > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > > > > > _______________________________________________
> > > > > > Datadraw-user mailing list
> > > > > > Dat...@li...
> > > > > >
> > > https://lists.sourceforge.net/lists/listinfo/datadraw-user
> > > > >
> > > > >
> > > > >
> > >
> > -------------------------------------------------------------------------
> > > > > This SF.net email is sponsored by: Microsoft
> > > > > Defy all challenges. Microsoft(R) Visual Studio 2008.
> > > > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > > > > _______________________________________________
> > > > > Datadraw-user mailing list
> > > > > Dat...@li...
> > > > > https://lists.sourceforge.net/lists/listinfo/datadraw-user
> > > > >
> > > >
> > > > --- original Nachricht Ende ----
> > > >
> > > >
> > > >
> > >
> > -------------------------------------------------------------------------
> > > > This SF.net email is sponsored by: Microsoft
> > > > Defy all challenges. Microsoft(R) Visual Studio 2008.
> > > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > > > _______________________________________________
> > > > Datadraw-user mailing list
> > > > Dat...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/datadraw-user
> > >
> > >
> > >
> > -------------------------------------------------------------------------
> > > This SF.net email is sponsored by: Microsoft
> > > Defy all challenges. Microsoft(R) Visual Studio 2008.
> > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > > _______________________________________________
> > > Datadraw-user mailing list
> > > Dat...@li...
> > > https://lists.sourceforge.net/lists/listinfo/datadraw-user
> > >
> > >
> > >
> > >
> > > --
> > > +1 514-518-8172
> > > -------------------------------------------------------------------------
> > > This SF.net email is sponsored by: Microsoft
> > > Defy all challenges. Microsoft(R) Visual Studio 2008.
> > > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> > > _______________________________________________ Datadraw-user mailing list
> > Dat...@li...
> > https://lists.sourceforge.net/lists/listinfo/datadraw-user
> >
>
> --- original Nachricht Ende ----
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Datadraw-user mailing list
> Dat...@li...
> https://lists.sourceforge.net/lists/listinfo/datadraw-user
|