[Sqlrelay-discussion] Dynamic cursors patch
From: Cal H. <ca...@fb...> - 2010-05-24 15:35:23
My first attempt at this email was caught by the spam filter due to size, so here are some Photobucket links to the graphs.

----------------------------------------------------------------------------------------------------

Just an update -- I've been using this patch on my production systems for about a day now. It seems to be functional, and it's a big memory reduction. I also set my initial connections to 0 with a 60-second TTL and a maxsessioncount of 170, so they shut down around every 5 minutes. Before my changes, each connection proc consumed 50-100MB of memory. After this patch and the config change, that dropped to 15-30MB.

I believe the maxsessioncount helped a lot as well. For some reason, our page delivery times would steadily increase over the course of 2 hours. My only guess is that there is some kind of leak? Setting the maxsessioncount seems to have fixed the problem.

Here are a few Cacti graphs I use to monitor everything.

This is the distribution of average page delivery times (APD). I switched everything over to SQL Relay right at 9:00am, did the config change for maxsessioncount at 11:20, and restarted both of my relay servers. You can see a nice, steady, linear increase in delivery times, but no other statistic I monitor shows the same relationship.

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Average_Page_Delivery_Distribution.png

During the same time frame the number of server procs increases, but it doesn't really follow the same relationship as APD. My only conclusion here is that it's not based on the number of concurrent clients or servers. (I slowly switched over to relay, one server at a time, over the course of 40 minutes.)

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Connections.png

Here are the number of cursors used during that time period. I know it's crazy, but it still doesn't show any correlation with delivery times. Note that this is with my patch.
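For reference, the connection-lifetime settings described above would look something like this in sqlrelay.conf. This is only a sketch built from the parameters named in this thread (connections, ttl, maxsessioncount, plus the cursors settings from the patch); the exact attribute names and placement may differ in your SQL Relay version, and the id/dbase values are made up:

```xml
<?xml version="1.0"?>
<instances>
  <!-- Sketch only: values taken from the setup described above.
       connections="0"        start with no connection daemons
       ttl="60"               shut idle daemons down after 60 seconds
       maxsessioncount="170"  recycle each daemon after 170 sessions
       cursors / maxcursors / cursors_growby come from the dynamic
       cursors patch discussed in this thread. -->
  <instance id="example" dbase="db2"
            connections="0" ttl="60" maxsessioncount="170"
            cursors="5" maxcursors="1300" cursors_growby="5">
    <!-- users, connections, and other child tags omitted -->
  </instance>
</instances>
```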
Without the dynamic cursors, I couldn't run this for more than 10 minutes before crushing the servers.

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Cursors.png

To show that the amount of traffic isn't responsible either, here are the number of queries per second over that timeframe.

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Queries.png

So my only conclusion is that there is something in the relay server connection daemon that slowly gets less efficient under a good amount of load. maxsessioncount is certainly a band-aid for the problem, but it might be something to look into. Has anyone else experienced something like this?

Thanks,

--Cal

On Tue, May 18, 2010 at 6:05 PM, Cal Heldenbrand <ca...@fb...> wrote:
> Hi everyone,
>
> Here's my patch against SQL Relay release 0.41 to implement dynamic
> cursors. It allows you to start up connections with a small number of
> cursors, then grow them as needed until a defined maximum is reached. This
> is handy for the occasional page that goes crazy with the number of
> cursors it needs. Sometimes it's difficult to track down where the leak
> is, so it's nice to have the server take care of this for you, without
> the memory bloat of allocating many cursors across all connections.
>
> I added 2 new config file parameters to the *instance* tag:
>
> *maxcursors*: limits the maximum number of cursors to this number.
> Defaults to 1300. I'm not sure what other databases are like, but DB2
> has a magic limit of 1326 statement handles.
>
> *cursors_growby*: when we need to allocate more cursors, add on a group
> of this many at a time. (This avoids many realloc() calls under heavy
> use.) Defaults to 5.
>
> The *cursors* parameter still behaves as usual: it starts up X initial
> cursors per connection.
>
> I also added one tiny bug fix -- it seems that the "times new cursor
> used" stat wasn't being updated. I modified the behavior to increment
> the counter when the *client* requests a new cursor, even if the server
> has already allocated one.
>
> Please let me know if you find any bugs with this patch.
>
> Thanks,
>
> --Cal