jTDS - SQL Server and Sybase JDBC driver / Bugs / #437 "All pipe instances are busy" error not handled properly

Mike Hutchinson - 2005-08-31

Logged In: YES
user_id=641437

David,

I assume that you are using network pipes rather than local
pipes? You may find the following link interesting, although
the solution described there (increasing the number of pipes
the server listens on) is difficult to implement with jTDS
at present because the pipe name is fixed:
http://support.microsoft.com/default.aspx?scid=KB;EN-US;165189

I don't think JCIFS implements WaitNamedPipe. I guess
trapping the "All pipe instances are busy" error from
SmbNamedPipe, sleeping for a while, and retrying a limited
number of times is the only option.

The real WaitNamedPipe receives a notification from the
server when the pipe becomes available, which ends the
wait sooner than is possible with the simple approach
suggested above. In practice I doubt this would be much of
an issue for a jTDS application if the sleep time is short.
Of course, to fit with your 10BASE2 analogy, you should use
a random timeout :-)
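
For illustration, the sleep-and-retry idea might look roughly
like the sketch below. It is not the jTDS patch. PipeRetry and
openWithRetry are hypothetical names, the opener stands in for
whatever call opens the pipe, and detecting the busy condition
by matching the exception message text is an assumption.

```java
import java.io.IOException;
import java.util.Random;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of jTDS or JCIFS. */
final class PipeRetry {
    static final String BUSY = "All pipe instances are busy";

    /**
     * Calls opener until it succeeds, retrying at most maxRetries times
     * when it fails with an IOException whose message contains BUSY.
     * Sleeps a random time in [minSleepMs, maxSleepMs] between attempts.
     */
    static <T> T openWithRetry(Callable<T> opener, int maxRetries,
                               int minSleepMs, int maxSleepMs) throws Exception {
        Random random = new Random();
        for (int retries = 0; ; retries++) {
            try {
                return opener.call();
            } catch (IOException e) {
                String msg = e.getMessage();
                // Assumption: the busy condition is recognisable from the message.
                boolean busy = msg != null && msg.contains(BUSY);
                if (!busy || retries >= maxRetries) {
                    throw e; // a different error, or out of retries
                }
                int sleepMs = minSleepMs + random.nextInt(maxSleepMs - minSleepMs + 1);
                Thread.sleep(sleepMs);
            }
        }
    }
}
```

Any other exception, or a busy error after the last allowed
retry, is passed straight back to the caller.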

Mike.

David D. Kilzer - 2005-08-31

Logged In: YES
user_id=84089

Mike,

Thanks for the pointer to the article! I've been wanting to
implement a mechanism to change the named pipe path;
depending on how we solve this issue, I may do that. (I'll
post an RFE and a patch for review.)

However, from that MS article, it looks like jTDS must
implement a back-off algorithm for retrying. (The MS
article said their code used a delay between 200 ms and 1000
ms, although they didn't say if they randomized it or not.)
I'll probably try to implement this as well since we want
to continue using jTDS at work.
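
As a rough sketch, a back-off delay within the range quoted
from the MS article could be computed as below. The doubling
schedule and the randomization are assumptions made for
illustration (the article only gives the 200 ms to 1000 ms
range), and BackOff and its methods are hypothetical names.

```java
import java.util.Random;

/** Hypothetical delay calculator; only the 200-1000 ms range comes from the thread. */
final class BackOff {
    static final int MIN_DELAY_MS = 200;
    static final int MAX_DELAY_MS = 1000;

    /** Upper bound for retry number attempt (0-based): doubles from the minimum, capped at the maximum. */
    static int baseDelayMs(int attempt) {
        int shift = Math.max(0, Math.min(attempt, 16)); // clamp so the shift cannot overflow
        long delay = (long) MIN_DELAY_MS << shift;
        return (int) Math.min(delay, MAX_DELAY_MS);
    }

    /** Random delay between the minimum and the current upper bound, inclusive. */
    static int jitteredDelayMs(int attempt, Random random) {
        int base = baseDelayMs(attempt);
        return MIN_DELAY_MS + random.nextInt(base - MIN_DELAY_MS + 1);
    }
}
```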

Finally, note that in my testing today, I saw the "All pipe
instances are busy" error when using both the "local Windows
filesystem" and the "jCIFS" methods of communication.

To reproduce the error, though, I had to have two DIFFERENT
computers try to access the SQL Server 6.5 server at the
same time. I tried creating a unit test that created 100
threads and ran them all concurrently. Each thread would
try to connect, then disconnect from the same database as
fast as it could. You'll have to run this on one computer,
then attempt to connect to the same database server from a
different computer to reproduce the failures.
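
The general shape of such a test might be sketched as follows.
This is not the attached AllPipeInstancesAreBusyTest.java.
ConcurrentConnectHarness is a hypothetical name, and the
action parameter stands in for a JDBC connect followed by a
close. For brevity each thread runs the action once rather
than looping.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical harness, not the attached unit test. */
final class ConcurrentConnectHarness {
    /**
     * Runs action once on each of threadCount threads, releasing them
     * together, and returns how many of the calls threw an exception.
     */
    static int run(int threadCount, Callable<?> action) throws InterruptedException {
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger failures = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await();   // hold every thread until all are started
                    action.call();   // e.g. open a connection and close it
                } catch (Exception e) {
                    failures.incrementAndGet();
                }
            });
            threads[i].start();
        }
        start.countDown();           // release all threads at once
        for (Thread t : threads) {
            t.join();
        }
        return failures.get();
    }
}
```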

Dave

David D. Kilzer - 2005-08-31

Unit test that creates 100 concurrent threads to connect to a database

AllPipeInstancesAreBusyTest.java

David D. Kilzer - 2005-09-01

Stack trace when using SharedNamedPipe with jTDS-1.1-cvs

bug-1277000-sharednamedpipe-apiab-exception.txt

David D. Kilzer - 2005-09-01

Stack trace when using SharedLocalNamedPipe with jTDS-1.1-cvs

bug-1277000-sharedlocalnamedpipe-apiab-exception.txt

David D. Kilzer - 2005-09-01

Logged In: YES
user_id=84089

Attaching a sample patch to fix the "All pipe instances are
busy" exception by implementing a retry mechanism. Comments
and feedback are welcome. I do NOT expect this patch to be
the final fix. (I left some debugging output in it.)

In testing, the most retries I've seen is 12. Most of the
time, only 1 or 2 retries are needed to establish a connection.

Mike Hutchinson - 2005-09-03

Logged In: YES
user_id=641437

David,

I was able to use your test case to replicate the error and
also to show that your patch works fine. I also tested
against a SQL 7.0 server, although with only two clients I
only got the pipe busy error once. I guess the newer servers
have more than one thread listening by default. I tested
both the network pipes and the local pipes options.

I did get up to 20 retries on one PC, but that was running on
a wireless LAN while the other client was connected at 100 Mb/s.

Looks good to go to me.

Given that SQL 6.5 officially went out of support on 1/1/2002,
it is surprising how many people are still using it. A case
of "if it's not broke, don't fix it", I suppose.

Mike.

David D. Kilzer - 2005-09-06

Logged In: YES
user_id=84089

Mike,

It's not so much if-it's-not-broke-don't-fix-it as it is
oops-we-didn't-realize-the-product-is-no-longer-supported-
and-now-we-have-to-deal-with-it-until-we-have-the-resources-
to-upgrade.

I've uploaded my "final" patch. However, after reviewing my
code, I'm wondering if the hard-coded "retryTimeout"
variable (which is set to 20 seconds) shouldn't be replaced
with the "loginTimeout" value. This would force users to
set this parameter (since it defaults to 0) if they're using
named pipes and experience the "All pipe instances are busy"
error message.

If I leave the "retryTimeout" in, the driver would "just
work" in most instances, but some users wouldn't understand
why performance is degraded with named pipes, and I would be
covering up the "All pipe instances are busy" error message
for them. If their connection attempts were always within
about one second of the 20-second timeout, they would see
the failures as random and not understand why they were
happening.

Comments?

Dave

Mike Hutchinson - 2005-09-06

Logged In: YES
user_id=641437

David,

I agree that it is generally a good idea to avoid hard-coded
constants, but I wonder if the use of the loginTimeout
property isn't a bit counterintuitive. What I mean is that
a loginTimeout value of zero usually means there is no login
timeout, but in this case any "pipe in use" error would cause
an immediate failure.

Perhaps the best compromise is to say that if loginTimeout
is 0, the default retry timeout of 20 seconds is used;
otherwise the loginTimeout is used.
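
In code, that compromise could be as small as the sketch
below. RetryTimeout and effectiveSeconds are hypothetical
names, not identifiers from the patch. Only the 20-second
default and the "0 means unset" convention come from this
discussion.

```java
/** Hypothetical helper showing the proposed rule. */
final class RetryTimeout {
    /** Default retry timeout discussed in this thread. */
    static final int DEFAULT_RETRY_TIMEOUT_SECONDS = 20;

    /** Uses loginTimeout when it is set (greater than 0), else the 20-second default. */
    static int effectiveSeconds(int loginTimeoutSeconds) {
        return loginTimeoutSeconds > 0
                ? loginTimeoutSeconds
                : DEFAULT_RETRY_TIMEOUT_SECONDS;
    }
}
```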

I understand your point about masking the underlying error
but looking at things from a support perspective, it is
better to have the app retry and keep working than have it
fail with an exception that could be regarded as a natural
consequence of the way named pipe listeners are implemented.
Why not output a message to the logger if a retry is invoked?

One day we should enhance the logging options so that we can
have useful diagnostic messages such as this one without the
massive network dump.

Mike.

David D. Kilzer - 2005-09-06

Patch to fix "All pipe instances are busy" error plus FAQ update and TestAllPipeInstancesAreBusy class

jtds-1.1-cvs-fix-allpipeinstancesarebusy-v3.diff

David D. Kilzer - 2005-09-06

Logged In: YES
user_id=84089

Mike,

Good point about defaulting the retry timeout to 20 seconds
when loginTimeout is 0 (default). I'm much happier with the
change now.

I've already added "Logger" output to the createNamedPipe()
method so that information is logged about retries if
logging is enabled.

I agree that we need a better logging infrastructure. I
know some people hate commons-logging, but I think it's very
useful when implemented correctly. The all-or-nothing of
the Logger class is just way too painful. :)

Dave

David D. Kilzer - 2005-09-06

Logged In: YES
user_id=84089

Committed patch v3 to CVS. Closing bug.

Dave
