#437 "All pipe instances are busy" error not handled properly

Status: closed
Labels: Named pipes (5)
Priority: 5
Updated: 2012-08-15
Created: 2005-08-30
Private: No

I didn't realize this when I originally implemented
named pipes in jTDS, but there is one error condition
that should be handled differently, apparently through
some kind of back-off/retry algorithm. (Anyone
remember 10base2 network cabling?) It is documented by
this MSDN library article about implementing a named
pipe client in Windows:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/ipc/base/named_pipe_client.asp

Basically, when the "All pipe instances are busy" error
is received, the jTDS driver should be doing the
equivalent of the WaitNamedPipe() function (whatever
that is; it will have to be determined).

A development team that I work with has been able to
reproduce the "All pipe instances are busy" error if
more than one person tries to access the same server
(not just the same database on the same server) at the
same time (or within a certain period of time). Note
that we've only reproduced it against SQL Server 6.5
thus far.

In theory, this could be easily tested by running the
jTDS test suite from two different computers, both
configured to hit the same database server using named
pipes.

Discussion

  • Mike Hutchinson

    Mike Hutchinson - 2005-08-31

    David,

    I assume that you are using network pipes rather than local
    pipes? You may find the following link interesting, although
    the solution described there (increasing the number of pipes
    the server listens on) is difficult to implement with jTDS
    at present, as the pipe name is fixed:
    http://support.microsoft.com/default.aspx?scid=KB;EN-US;165189

    I don't think jCIFS implements WaitNamedPipe. I guess
    trapping the "All pipe instances are busy" error from
    SmbNamedPipe, sleeping for a while, and retrying a limited
    number of times is the only option.

    The real WaitNamedPipe receives a notification from the
    server when the pipe becomes available, which ends the
    wait state more quickly than is possible with the simple
    approach suggested above. In practice I doubt this would be
    much of an issue for jTDS applications if the sleep time
    is short. Of course, to fit with your 10base2 analogy, you
    should use a random timeout :-)

    Mike.
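
The sleep-and-retry approach described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the jTDS patch: the class `PipeBusyRetry`, the interface `PipeOpener`, and all parameters are hypothetical, and matching the error by its message text is an assumption about how the busy condition surfaces from the pipe implementation.

```java
import java.io.IOException;
import java.util.Random;

/** Hypothetical helper for illustration; not part of jTDS or jCIFS. */
final class PipeBusyRetry {
    /** Hypothetical abstraction over "open the named pipe". */
    interface PipeOpener<T> {
        T open() throws IOException;
    }

    // Assumption: the busy condition is recognizable from the exception message.
    private static final String BUSY_TEXT = "all pipe instances are busy";

    static boolean isPipeBusy(IOException e) {
        String message = e.getMessage();
        return message != null && message.toLowerCase().contains(BUSY_TEXT);
    }

    /**
     * Tries to open the pipe, retrying up to maxRetries times when the
     * "All pipe instances are busy" error is seen. Between attempts it sleeps
     * for a random time in [minSleepMillis, maxSleepMillis], in the spirit of
     * a randomized back-off. Any other error is rethrown immediately.
     */
    static <T> T openWithRetry(PipeOpener<T> opener, int maxRetries,
                               int minSleepMillis, int maxSleepMillis,
                               Random random)
            throws IOException, InterruptedException {
        if (minSleepMillis < 0 || maxSleepMillis < minSleepMillis) {
            throw new IllegalArgumentException("invalid sleep bounds");
        }
        for (int attempt = 0; ; attempt++) {
            try {
                return opener.open();
            } catch (IOException e) {
                if (!isPipeBusy(e) || attempt >= maxRetries) {
                    throw e; // not a busy error, or retries exhausted
                }
                int span = maxSleepMillis - minSleepMillis + 1;
                Thread.sleep(minSleepMillis + random.nextInt(span));
            }
        }
    }
}
```

Unlike the real WaitNamedPipe, this polls instead of waiting for a server notification, so a short sleep interval trades a little extra traffic for a faster reconnect.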

     
  • David D. Kilzer

    David D. Kilzer - 2005-08-31

    Mike,

    Thanks for the pointer to the article! I've been wanting to
    implement a mechanism to change the named pipe path;
    depending on how we solve this issue, I may do that. (I'll
    post an RFE and a patch for review.)

    However, from that MS article, it looks like jTDS must
    implement a back-off algorithm for retrying. (The MS
    article said their code used a delay between 200 ms and 1000
    ms, although they didn't say if they randomized it or not.)
    I'll probably try to implement this as well since we want
    to continue using jTDS at work.

    Finally, note that in my testing today, I saw the "All pipe
    instances are busy" error when using both the "local Windows
    filesystem" and the "jCIFS" methods of communication.

    To reproduce the error, though, I had to have two DIFFERENT
    computers try to access the SQL Server 6.5 server at the
    same time. I tried creating a unit test that created 100
    threads and ran them all concurrently. Each thread would
    try to connect, then disconnect from the same database as
    fast as it could. You'll have to run this on one computer,
    then attempt to connect to the same database server from a
    different computer to reproduce the failures.

    Dave
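
A rough Java sketch of the kind of stress test described above: many threads released at once, each connecting and disconnecting in a tight loop while failures are counted. It is not the attached unit test; `ConcurrentConnectStress`, `ConnectAction`, and the parameters are hypothetical names chosen for illustration, and the connect step is left abstract so the sketch runs without a database.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stress harness for illustration; not the attached jTDS test. */
final class ConcurrentConnectStress {
    /** One connect-then-disconnect cycle against the server under test. */
    interface ConnectAction {
        void connectAndDisconnect() throws Exception;
    }

    /**
     * Starts the given number of threads, releases them together, and lets
     * each run the action iterationsPerThread times. Returns how many cycles
     * failed with an exception.
     */
    static int run(int threads, int iterationsPerThread, ConnectAction action,
                   long timeoutSeconds) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch startGate = new CountDownLatch(1);
        AtomicInteger failures = new AtomicInteger();

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    startGate.await(); // wait so all threads start together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iterationsPerThread; i++) {
                    try {
                        action.connectAndDisconnect();
                    } catch (Exception e) {
                        failures.incrementAndGet();
                    }
                }
            });
        }

        startGate.countDown();
        pool.shutdown();
        if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
            pool.shutdownNow();
            throw new IllegalStateException("stress run did not finish in time");
        }
        return failures.get();
    }
}
```

With JDBC, the action could be something like `() -> DriverManager.getConnection(url).close()`, where `url` is a jTDS connection URL configured for named pipes. As noted above, the failures only showed up when a second computer connected to the same server while the test was running.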

     
  • David D. Kilzer

    David D. Kilzer - 2005-08-31

    Unit test that creates 100 concurrent threads to connect to a database

     
  • David D. Kilzer

    David D. Kilzer - 2005-09-01

    Stack trace when using SharedNamedPipe with jTDS-1.1-cvs

     
  • David D. Kilzer

    David D. Kilzer - 2005-09-01

    Stack trace when using SharedLocalNamedPipe with jTDS-1.1-cvs

     
  • David D. Kilzer

    David D. Kilzer - 2005-09-01

    Attaching a sample patch to fix the "All pipe instances are
    busy" exception by implementing a retry mechanism. Comments
    and feedback are welcome. I do NOT expect this patch to be
    the final fix. (I left some debugging output in it.)

    In testing, the most retries I've seen is 12. Most of the
    time, only 1 or 2 retries are needed to establish a connection.

     
  • Mike Hutchinson

    Mike Hutchinson - 2005-09-03

    David,

    I was able to use your test case to replicate the error and
    also to show that your patch works fine. I also tested
    against a SQL 7.0 server, although with only two clients I
    only got the pipe busy error once. I guess the newer servers
    have more than one thread listening by default. I tested
    both the network pipes and the local pipes options.
    I did get up to 20 retries on one PC, but that was running on
    a wireless LAN while the other client was connected at 100 Mbps.

    Looks good to go to me.

    Given that SQL 6.5 officially went out of support on 1/1/2002,
    it is surprising how many people are still using it. A case
    of "if it's not broke, don't fix it", I suppose.

    Mike.

     
  • David D. Kilzer

    David D. Kilzer - 2005-09-06

    Mike,

    It's not so much if-it's-not-broke-don't-fix-it as it is
    oops-we-didn't-realize-the-product-is-no-longer-supported-
    and-now-we-have-to-deal-with-it-until-we-have-the-resources-
    to-upgrade.

    I've uploaded my "final" patch. However, after reviewing my
    code, I'm wondering if the hard-coded "retryTimeout"
    variable (which is set to 20 seconds) shouldn't be replaced
    with the "loginTimeout" value. This would force users to
    set this parameter (since it defaults to 0) if they're using
    named pipes and experience the "All pipe instances are busy"
    error message.

    If I leave the "retryTimeout" in, the driver would "just
    work" in most instances, but some users wouldn't understand
    why performance is degraded with named pipes, and I would be
    covering up the "All pipe instances are busy" error message
    for them. If they had timeouts that always were within one
    second of a 20-second timeout, they would see the failures
    as random and not understand why they were happening.

    Comments?

    Dave

     
  • Mike Hutchinson

    Mike Hutchinson - 2005-09-06

    David,

    I agree that it is generally a good idea to avoid hard-coded
    constants, but I wonder if the use of the loginTimeout
    property isn't a bit counterintuitive. What I mean is that
    a loginTimeout value of zero usually means there is no login
    timeout, but in this case any "pipe in use" error would cause
    an immediate failure.

    Perhaps the best compromise is to say that if loginTimeout
    is 0, then the default retry timeout of 20 seconds is used;
    otherwise, the loginTimeout is used.

    I understand your point about masking the underlying error
    but looking at things from a support perspective, it is
    better to have the app retry and keep working than have it
    fail with an exception that could be regarded as a natural
    consequence of the way named pipe listeners are implemented.
    Why not output a message to the logger if a retry is invoked?

    One day we should enhance the logging options so that we can
    have useful diagnostic messages such as this one without the
    massive network dump.

    Mike.
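
The compromise proposed above (fall back to a 20-second retry window when loginTimeout is 0, otherwise use loginTimeout) could look roughly like this in Java. The class and method names are hypothetical and are not taken from the jTDS source; treating loginTimeout as a value in seconds follows the usual JDBC convention, and treating negative values like 0 is an assumption made for this sketch.

```java
/** Hypothetical policy helper for illustration; names are not from jTDS. */
final class RetryTimeoutPolicy {
    /** Default retry window from the discussion above: 20 seconds. */
    static final int DEFAULT_RETRY_TIMEOUT_SECONDS = 20;

    /**
     * Returns the retry window in milliseconds. A loginTimeout of 0 means
     * "not set", so the 20-second default applies; otherwise the
     * loginTimeout itself bounds the retries. Negative values are treated
     * like 0 (an assumption of this sketch).
     */
    static long retryTimeoutMillis(int loginTimeoutSeconds) {
        int seconds = loginTimeoutSeconds > 0
                ? loginTimeoutSeconds
                : DEFAULT_RETRY_TIMEOUT_SECONDS;
        return seconds * 1000L;
    }

    /** True while the elapsed time is still inside the retry window. */
    static boolean mayRetry(long startMillis, long nowMillis,
                            int loginTimeoutSeconds) {
        return nowMillis - startMillis < retryTimeoutMillis(loginTimeoutSeconds);
    }
}
```

A retry loop would record `System.currentTimeMillis()` before the first attempt, call `mayRetry` after each "pipe busy" failure, and log a message each time a retry is taken so the underlying error is not hidden completely.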

     
  • David D. Kilzer

    David D. Kilzer - 2005-09-06

    Patch to fix "All pipe instances are busy" error plus FAQ update and TestAllPipeInstancesAreBusy class

     
  • David D. Kilzer

    David D. Kilzer - 2005-09-06

    Mike,

    Good point about defaulting the retry timeout to 20 seconds
    when loginTimeout is 0 (default). I'm much happier with the
    change now.

    I've already added "Logger" output to the createNamedPipe()
    method so that information is logged about retries if
    logging is enabled.

    I agree that we need a better logging infrastructure. I
    know some people hate commons-logging, but I think it's very
    useful when implemented correctly. The all-or-nothing of
    the Logger class is just way too painful. :)

    Dave

     
  • David D. Kilzer

    David D. Kilzer - 2005-09-06

    Committed patch v3 to CVS. Closing bug.

    Dave

     