Thread: [Rubydotnet-developer] Performance
From: Tim S. <ti...@ih...> - 2004-10-29 08:31:12
I've done some performance tests.

- The bridges that the SaltyPickle folks wrote, and the one I wrote, use a Hashtable of ids.
- Thomas' bridge uses GCHandle.

The Hashtable bridges force pointer equality comparison with Object#== rather than the default Object#Equals, which can be overridden by subclasses. The reason we do this is that we think "if two objects happen to be equal when they're created, and we give them the same id, then when one is mutated the other will also be mutated. That would be bad."

Now imagine what happens when you do

    require 'dotnet'
    r = DotNet.new('somestring')
    1_000_000.times { r.ToUpper }

Each string returned by ToUpper() has a different pointer, so is different wrt Object#==. But GetHashCode() returns the same value for each! Every one of these strings collides with all the others, which makes our Hashtable lookup linear instead of constant. Of course some of these values will be garbage collected as we go along, but my tests showed we have up to several thousand strings in the Hashtable at once. (The count cycles from small to several thousand and back to small as the gc is invoked.) This makes performance baaad.

One solution is to recognise that strings (and integers and ...) are not mutable, so do a special case for strings and use Equals in that case.

A better solution is to note that the implementation of Equals in standard .NET classes is always == for mutable objects. (I think...) Therefore Equals should always be used by the Hashtable.

Alternatively, use GCHandle like Thomas did. The nice thing about this is that it makes handling garbage collection very easy: simply have the Ruby free function for the proxy object call GCHandle.Free().

Hashtable with Equals can actually perform about 1.5 times the speed of using GCHandles in the artificial benchmark above. (Where we're getting the same value returned over and over again.)

And now for some numbers...
The first set of numbers is from my head and may be wrong: my bridge was getting ~500 calls per second, the other two were a bit faster, maybe 800 or 1500.

I made two changes to my bridge:

1) Made some things in my DotNet::Instance initialisation "lazy". (Getting FullName, AssemblyID, creating DotNet::Class.) A big win, since the above example never uses the result of `r.ToUpper'.
2) The Hashtable change. Either to use Equals or GCHandle.

And now for the nice numbers...

Using Hashtable with Equals: ~86,000 calls per second.
Using GCHandle: ~58,000 calls per second.

I actually think GCHandle is better because

- Simpler to code.
- This benchmark is in some ways the best case for Hashtable with Equals. I suspect GCHandle will perform better on average in real life.
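
The collision problem described above can be reproduced in plain Ruby. This is only a sketch under stated assumptions, not code from any of the bridges: `CollidingKey` is a hypothetical stand-in for a proxied .NET string, with a constant `hash` (like GetHashCode() on equal strings) and identity-only `eql?` (like the pointer comparison). Counting the `eql?` calls shows how much work each insert has to do once every key shares a hash code.

```ruby
# Hypothetical stand-in for a proxied .NET object: every instance reports the
# same hash code (like the equal strings returned by ToUpper), but only the
# very same instance counts as equal (like the pointer comparison).
class CollidingKey
  class << self
    attr_accessor :comparisons
  end
  self.comparisons = 0

  def hash
    42
  end

  def eql?(other)
    CollidingKey.comparisons += 1
    equal?(other)
  end
end

# Insert n distinct-but-colliding keys into a fresh Hash and return how many
# eql? calls the Hash needed to do it.
def comparisons_for_inserts(n)
  CollidingKey.comparisons = 0
  table = {}
  n.times { |i| table[CollidingKey.new] = i }
  CollidingKey.comparisons
end

[100, 200, 400].each do |n|
  puts "#{n} inserts: #{comparisons_for_inserts(n)} key comparisons"
end
```

With keys that hash well, an insert needs few or no comparisons. Here the count should grow much faster than the number of inserts, which is the linear-per-lookup behaviour blamed for the slow Hashtable bridges.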
From: Tim S. <ti...@ih...> - 2004-10-29 09:37:02
On Fri, Oct 29, 2004 at 09:30:59PM +1300, Tim Sutherland wrote:
[...]
> The first set of numbers is from my head and may be wrong:
> my bridge was getting ~500 calls per second, the other two were a bit
> faster, maybe 800 or 1500.
[...]
> And now for the nice numbers...
>
> Using Hashtable with Equals: ~86,000 calls per second.
> Using GCHandle: ~58,000 calls per second.
[...]

I think Thomas' was around 5000 or so calls per second. It was definitely the best (because of the other bridges' Hashtable problem). But even that bridge can probably be made 10x faster or more pretty easily, as shown by my second set of results.
From: Tim S. <ti...@ih...> - 2004-10-29 11:11:32
On Fri, Oct 29, 2004 at 10:36:45PM +1300, Tim Sutherland wrote:
> On Fri, Oct 29, 2004 at 09:30:59PM +1300, Tim Sutherland wrote:
> [...]
> > The first set of numbers is from my head and may be wrong:
> > my bridge was getting ~500 calls per second, the other two were a bit
> > faster, maybe 800 or 1500.
> [...]
> > And now for the nice numbers...
> >
> > Using Hashtable with Equals: ~86,000 calls per second.
> > Using GCHandle: ~58,000 calls per second.
> [...]
>
> I think Thomas' was around 5000 or so calls per second. It was
> definitely the best (because of the other bridges' Hashtable problem).
> But even that bridge can probably be made 10x faster or more pretty
> easily, as shown by my second set of results.

I checked my actual numbers...

I ran each program with whatever number of iterations made it run in around 10 seconds. Then I repeated with twice as many iterations, expecting around 20 seconds. (This was not always the case.)

My original bridge was getting between 445 and 888 calls per second. (It varied a lot depending on the number of iterations, for gc reasons?)

Adding in ``lazy DotNet::Instance initialisation'' actually made it slower, between 258 and 409 calls per second. This surprised me since the things we're avoiding initialising are never actually used by the benchmark. I think it's changing the gc somehow... When I added some debugging to print out around 20,000 lines, the bridge was about twice as fast. Again, gc reasons I think, with the linearity of the Hashtable lookups.

Changing the Hashtable to use Equals instead of == resulted in 21,000 calls per second. Adding in laziness gave 47,000.

Using GCHandle, but not lazy: 26,000 calls per second.
Using GCHandle, and also lazy: 60,000 calls per second.

(So GCHandles are actually faster, contrary to my previous post.)

[Hopefully I won't have to follow up quickly to this post!]
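
The measurement procedure described above (time N iterations, then 2N, and compare) could be sketched roughly as follows. This is not the script used for the numbers in this thread. The helper names `calls_per_second` and `doubling_check` are made up for illustration, and a plain Ruby `String#upcase` call stands in for `r.ToUpper`, since the bridge itself is not assumed to be available.

```ruby
# Time `iterations` calls of the block and return the rate in calls per second.
def calls_per_second(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  iterations / elapsed
end

# Measure at n and at 2n iterations. If the two rates differ a lot, something
# other than the per-call cost (gc, table growth, ...) is probably involved.
def doubling_check(n, &call)
  [n, 2 * n].map { |count| [count, calls_per_second(count, &call)] }
end

s = 'somestring' # stand-in for r = DotNet.new('somestring')
doubling_check(200_000) { s.upcase }.each do |count, rate|
  puts format('%d iterations: %.0f calls per second', count, rate)
end
```

Reporting both rates side by side makes swings like the 445 to 888 range above visible straight away, instead of hiding them in a single average.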