Re: [Dbbalancer-users] Re: about DBbalancer
From: Daniel V. S. <dv...@ar...> - 2002-02-13 23:37:46
On Fri 08 Feb 2002 13:52, Andrew McMillan wrote:
> On Thu, 2002-02-07 at 13:46, Daniel Varela Santoalla wrote:
> Well not _necessarily_ synchronous. The point is that like Apache we
> want a pre-forked pool of connections! The (a) choice of my earlier
> e-mail, where we use an existing pre-forked connection _is_ essential,
> and having a <configurable> number of these ready to handle demand is
> critical. It's also current behaviour.

Well, this can be achieved with the current schema. Currently we create as
many new connections as pools in each "grow" (with plain connection pooling
and no balancing this equals one, as there is only one host and hence one
pool), but we could easily parametrize a multiplying factor. This would
always give us enough "spare power" to handle continued load growth.

> What this proposal is talking about are those (hopefully fairly
> infrequent) situations where that isn't enough. Slashdot links to us
> and suddenly we are dead meat, but _how_ did we die? We should try and
> handle this unpredictable load as far as is possible.
>
> In the real world when these situations happen we _do_ have spare
> capacity to cope with peaks - just not enough for Everest. So we want
> to do our best to support the extra users, but we don't want to overdo
> it and end up serving nobody...

In those cases the maximum connection limit should cope.

> I see. So for incoming connections which are in a 'pending' state we
> will need a thread available? Why? I had assumed that it would be
> handled within the connection setup, before we have decided whether or
> not we can actually service the request.

The connections are accepted by the main thread, in DBBalancerDaemon.cc,
method "run()". Then they are immediately enqueued until an available thread
comes along and processes the request. No connection is ever rejected,
whatever the load; it is just enqueued.

> I see this point as being a queue, rather than a parallel set of
> threads. The connection threads would be what we would hand the
> connection off to, once we had made sure one was available. Obviously
> this complicates the connection startup code, but it doesn't have to get
> in the way too much.

There is a queue _and_ a parallel set of threads, all of them competing to
get a "Request" off the queue.
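A minimal sketch of the accept-then-enqueue flow described above: the
acceptor never rejects a connection, it only pushes a Request onto a shared
queue, and a fixed pool of worker threads competes to pop Requests off that
queue. This is not DBBalancer's actual code; Request, RequestQueue, worker
and every parameter here are illustrative names only.

#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Request {
    int client_fd;   // stand-in for the accepted client socket
};

class RequestQueue {
public:
    void push(const Request& r) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(r);          // never rejected, only enqueued
        }
        cond_.notify_one();          // wake one of the competing workers
    }

    Request pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        Request r = queue_.front();
        queue_.pop();
        return r;
    }

private:
    std::queue<Request> queue_;
    std::mutex mutex_;
    std::condition_variable cond_;
};

// Each worker would hold one pre-forked backend connection and loop forever,
// racing the other workers for the next Request; a negative fd stops it here.
void worker(int id, RequestQueue& queue) {
    for (;;) {
        Request r = queue.pop();
        if (r.client_fd < 0)
            break;
        std::cout << "worker " << id << " serving client fd "
                  << r.client_fd << "\n";
    }
}

int main() {
    RequestQueue queue;

    const int pool_size = 4;                       // configurable worker count
    std::vector<std::thread> pool;
    for (int i = 0; i < pool_size; ++i)
        pool.emplace_back(worker, i, std::ref(queue));

    // The acceptor loop, simulated here by enqueueing a few fake requests.
    for (int fd = 100; fd < 110; ++fd)
        queue.push(Request{fd});

    // Shut down: one stop marker per worker, then join.
    for (int i = 0; i < pool_size; ++i)
        queue.push(Request{-1});
    for (std::thread& t : pool)
        t.join();
}

Which worker wins each Request is decided by whoever grabs the lock first,
which is also why the dispatch order is hard to predict, as noted later in
the message.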
> Yes, the inbound/outbound connection numbers should always be the same,
> although the configuration file lets you specify separate counts
> (probably needs to be fixed :-)

Sure :-). I defined those parameters before actually implementing any
algorithm, and I'm a little too lazy to delete my code (you never know...),
but it gets confusing, I know. I'll change that soon.

> Could we implement what I am proposing within the current architecture
> by adding a function to your manager thread that will get it to create a
> new connection in the pool on request. An incoming connection could
> then queue; if no connections are available, the thread requests a
> connection and enters a queue. When the connection is started in due
> course we can hand it over to the connection request which can then
> commence work.

Some points here:

a) We have to be very careful about how much "load" we add to connection
accepting. It runs serialized in the main thread for every connection
attempt, so it can easily become a bottleneck if not handled with care.

b) There is no way for an "about-to-be-enqueued" request to know whether it
will have a thread/connection available. It could check the queue state but,
given the number of concurrent threads, that check wouldn't be "definitive"
at any given moment.

c) Even if the request "signalled" the manager thread to create more
threads/connections, they wouldn't be created until the next pass of the
manager thread (once every "daemon.reaper-delay"), which is exactly what
happens with the current strategy. If minimum response time is what we need
(response time defined as time_when_new_thread/connection_is_created minus
time_when_we_could_have_known_we_needed_one), then another strategy could be
implemented, but I'm not so sure we need to reduce it...

> > And the strategy could be the current, modified, or any other. But take
> > into consideration that while Apache only pools one kind of resource
> > (processes), we pool two different ones (threads and connections).
>
> Sounds good. Can you not encapsulate the thread / db connection pair?
> Are they not inseparable?

I thought about that point, but I see two difficulties:

a) The threads are hidden inside the ACE_Task class from the ACE library.
Of course, the thread pool could be redesigned without it...

b) The thread dispatching algorithm is not very predictable: you have a lot
of threads competing for a lock, basically. Though this wouldn't be a problem
for a simple connection pool, for a load-balancing one you would probably
want to control the sequence of connection assignment more finely, and in a
different way than the thread assignment. Having them separated, we can
control independently the way we assign connections from the pools...

I'll submit the new version to CVS soon. It will hopefully include a
"destroyer" variable-load test...

Best Regards
Daniel.

--
----------------------------------
Regards from Spain. Daniel Varela
----------------------------------
If you think education is expensive, try ignorance.
        -Derek Bok (Former Harvard President)
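As a footnote to the points above, here is a minimal sketch of a "grow" pass
that runs once per reaper-delay interval of the manager thread, with the
multiplying factor parametrized and capped by a maximum connection limit.
None of this is DBBalancer's real code; ConnectionPool, GrowthPolicy and all
the parameter values are assumptions made only for illustration.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

struct ConnectionPool {
    std::string host;
    int open_connections;
};

struct GrowthPolicy {
    int grow_factor;    // connections opened per pool in each grow
    int max_per_pool;   // hard ceiling, so a peak cannot sink the daemon
};

// One pass of the manager thread: grow every pool if requests are waiting.
void grow_if_needed(std::vector<ConnectionPool>& pools,
                    int pending_requests,
                    const GrowthPolicy& policy) {
    if (pending_requests == 0)
        return;                                   // enough spare capacity
    for (ConnectionPool& pool : pools) {
        int target = std::min(pool.open_connections + policy.grow_factor,
                              policy.max_per_pool);
        while (pool.open_connections < target) {
            // Stands in for opening a real backend connection.
            ++pool.open_connections;
            std::cout << "opened a connection to " << pool.host << "\n";
        }
    }
}

int main() {
    std::vector<ConnectionPool> pools = {{"db1", 1}, {"db2", 1}};
    GrowthPolicy policy{2, 16};

    const auto reaper_delay = std::chrono::milliseconds(100);
    int pending = 3;   // pretend some requests were queued between passes

    for (int pass = 0; pass < 3; ++pass) {
        std::this_thread::sleep_for(reaper_delay); // wait for the next pass
        grow_if_needed(pools, pending, policy);
        pending = 0;                               // extra connections caught up
    }
}

With grow_factor = 1 this reduces to the current behaviour of one new
connection per pool per grow; a larger factor is the "spare power" knob
mentioned near the top of the message.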