[Sqlrelay-discussion] Dynamic cursors patch
From: Cal H. <ca...@fb...> - 2010-05-24 15:35:23
My first attempt at this email was caught by the spam filter due to size, so here are some Photobucket links to the graphs.

----------------------------------------------------------------------------------------------------

Just an update -- I've been using this patch on my production systems for about a day now. It seems to be functional, and it's a big memory reduction. I also set my initial connections to 0 with a 60-second TTL and a maxsessioncount of 170, so they shut down around every 5 minutes. Before my changes, each connection proc consumed 50-100MB of memory. After this patch and the config change, that dropped to 15-30MB.

I believe the maxsessioncount helped a lot as well. For some reason, our page delivery times would steadily increase over the course of 2 hours. My only guess is that there is some kind of leak? Setting the maxsessioncount seems to have fixed the problem.

Here are a few Cacti graphs I use to monitor everything.

This is the distribution of average page delivery times (APD). I switched everything over to SQL Relay right at 9:00am, did the config change for maxsessioncount at 11:20, and restarted both of my relay servers. You can see a nice, steady, linear increase in delivery times, but no other statistic I monitor shows the same relationship.

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Average_Page_Delivery_Distribution.png

During the same time frame the number of server procs increases, but it doesn't really follow the same relationship as APD. My only conclusion here is that it's not based on the number of concurrent clients or servers. (I slowly switched over to relay, one server at a time, over the course of 40 minutes.)

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Connections.png

Here are the number of cursors used during that time period. I know it's crazy, but it still doesn't show any correlation with delivery times. Note that this is with my patch.
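For reference, the connection-lifetime settings described above would look something like this in sqlrelay.conf. This is only a sketch built from the parameters named in this thread (connections, ttl, maxsessioncount, plus the cursors settings from the patch); the exact attribute names and placement may differ in your SQL Relay version, and the id/dbase values are made up:

```xml
<?xml version="1.0"?>
<instances>
  <!-- Sketch only: values taken from the setup described above.
       connections="0"        start with no connection daemons
       ttl="60"               shut idle daemons down after 60 seconds
       maxsessioncount="170"  recycle each daemon after 170 sessions
       cursors / maxcursors / cursors_growby come from the dynamic
       cursors patch discussed in this thread. -->
  <instance id="example" dbase="db2"
            connections="0" ttl="60" maxsessioncount="170"
            cursors="5" maxcursors="1300" cursors_growby="5">
    <!-- users, connections, and other child tags omitted -->
  </instance>
</instances>
```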
Without the dynamic cursors, I couldn't run this for more than 10 minutes before crushing the servers.

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Cursors.png

To show that the amount of traffic isn't responsible either, here are the number of queries per second over that timeframe.

http://i1001.photobucket.com/albums/af138/cal_heldenbrand/Queries.png

So my only conclusion is that there is something in the relay server connection daemon that slowly gets less efficient under a good amount of load. maxsessioncount is certainly a band-aid for the problem, but it might be something to look into. Has anyone else experienced something like this?

Thanks,

--Cal

On Tue, May 18, 2010 at 6:05 PM, Cal Heldenbrand <ca...@fb...> wrote:
> Hi everyone,
>
> Here's my patch against SQL Relay release 0.41 to implement dynamic
> cursors. It allows you to start up connections with a small number of
> cursors, then grow them as needed until a defined maximum is reached. This
> is handy for the occasional page that goes crazy with the number of
> cursors it needs. Sometimes it's difficult to track down where the leak
> is, so it's nice to have the server take care of this for you, without
> the memory bloat of allocating many cursors across all connections.
>
> I added 2 new config file parameters to the *instance* tag:
>
> *maxcursors*: limits the maximum number of cursors to this number.
> Defaults to 1300. I'm not sure what other databases are like, but DB2
> has a magic limit of 1326 statement handles.
>
> *cursors_growby*: when we need to allocate more cursors, add on a group
> of this many at a time. (This avoids many realloc() calls under heavy
> use.) Defaults to 5.
>
> The *cursors* parameter still behaves as usual: it starts up X initial
> cursors per connection.
>
> I also added one tiny bug fix -- it seems that the "times new cursor
> used" stat wasn't being updated. I modified the behavior to increment
> the counter when the *client* requests a new cursor, even if the server
> has already allocated one.
>
> Please let me know if you find any bugs with this patch.
>
> Thanks,
>
> --Cal