Thread: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Hrvoje N. <hn...@xe...> - 2005-09-30 09:45:19
We're having a problem deploying Proxool 0.8.3 with Oracle. The relevant settings are:

    jdbc-0.proxool.maximum-connection-count 30
    jdbc-0.proxool.minimum-connection-count 2
    jdbc-0.proxool.house-keeping-sleep-time 60000
    jdbc-0.proxool.house-keeping-test-sql select * from dual
    jdbc-0.proxool.test-before-use true
    jdbc-0.proxool.maximum-active-time 240000
    jdbc-0.proxool.maximum-connection-lifetime 1800000
    ...

So proxool should maintain no less than 2 and no more than 30 connections to the database. But, over time, it appears that proxool opens more connections, eventually exhausting Oracle's SESSIONS_PER_USER limit. All the connections are usable, and the additional ones are opened at times when there are other usable connections available. For example, when creating the 33rd connection, Proxool's log contains:

    2005-09-27 01:36:57 DEBUG [ConnectionPool] [Thread-18]: 1003445 -000257 (04/32/00) - Connection #1646 tested: OK
    2005-09-27 01:36:57 DEBUG [chatdate] [Prototyper]: 1003446 -000257 (04/33/00) - Connection #1666 created to achieve minimum of 2 = AVAILABLE

04/32/00 shows that ConnectionPool is aware that it has a total of 32 connections, 28 of which are available, and yet the Prototyper thinks it has fewer than two open connections and is trying to open more to make up for the perceived shortage. Examining established connections with netstat shows that ConnectionPool's numbers are correct.

A possible cause for this problem was outlined by Peter Radics in his message from Jan 11th, http://tinyurl.com/7eb77/. He noticed that each time HouseKeeper sweeps a connection that has been active for too long, connectionCount gets decremented twice. However, no one responded to that message and I don't know if this problem has been fixed in the development sources. I'm reluctant to apply his patch unchanged because it feels like a kludge and because the last part contains an increment I don't understand.

Another problem I noticed with the Prototyper code is that it lacks synchronization when decrementing connectionCount. That cannot be good, although it is probably not the cause of this particular problem.

Are you aware of a fix or a workaround for this? Thanks for your help.
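The two concerns raised above (a counter decremented twice for one connection, and a counter updated without synchronization) can be illustrated with a minimal sketch. This is not Proxool's code and not Peter Radics's patch: the classes `PooledConnectionStub` and `PoolCounterSketch` and all of their methods are hypothetical names, and the approach (an idempotent per-connection removal flag plus an atomic counter) is an assumption about one way such a fix could look.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a pooled connection; not a Proxool class. */
final class PooledConnectionStub {
    private final AtomicBoolean removed = new AtomicBoolean(false);

    /** Returns true only for the first caller, however many threads race here. */
    boolean markRemoved() {
        return removed.compareAndSet(false, true);
    }
}

/** Hypothetical counter that illustrates the idea; not Proxool's Prototyper. */
final class PoolCounterSketch {
    // AtomicInteger avoids lost updates when several threads change the count.
    private final AtomicInteger connectionCount = new AtomicInteger();

    void connectionCreated() {
        connectionCount.incrementAndGet();
    }

    /** Decrements at most once per connection, even if removal is requested twice. */
    void connectionRemoved(PooledConnectionStub connection) {
        if (connection.markRemoved()) {
            connectionCount.decrementAndGet();
        }
    }

    int count() {
        return connectionCount.get();
    }

    public static void main(String[] args) {
        PoolCounterSketch counter = new PoolCounterSketch();
        PooledConnectionStub swept = new PooledConnectionStub();
        counter.connectionCreated(); // the connection that will be swept
        counter.connectionCreated(); // some other connection that stays open
        counter.connectionRemoved(swept); // first removal, e.g. active for too long
        counter.connectionRemoved(swept); // second removal of the same connection
        System.out.println(counter.count()); // prints 1
    }
}
```

With a plain `int` and no guard, the second removal would leave the count at 0 while one connection is still open, which is the kind of drift that would make a minimum-connection check open connections that are not needed.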
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Bill H. <arc...@gm...> - 2005-09-30 10:00:46
Hi Hrvoje,

On 9/30/05, Hrvoje Niksic <hn...@xe...> wrote:
> So proxool should maintain no less than 2 and no more than 30
> connections to the database. But, over time, it appears that proxool
> opens more connections, eventually exhausting Oracle's
> SESSIONS_PER_USER limit.

That is strange. We've made this a lot more robust in the latest code - all connections are wrapped and specifically protect themselves against multiple closures. Perhaps you'd like to try out the latest, unofficial beta version?

http://proxool.sourceforge.net/download/proxool-0.8.9b6.jar

I'd be interested in your results.

- Bill
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Hrvoje N. <hn...@xe...> - 2005-09-30 10:24:57
Bill Horsman <arc...@gm...> writes:

> On 9/30/05, Hrvoje Niksic <hn...@xe...> wrote:
>
> > So proxool should maintain no less than 2 and no more than 30
> > connections to the database. But, over time, it appears that proxool
> > opens more connections, eventually exhausting Oracle's
> > SESSIONS_PER_USER limit.
>
> That is strange. We've made this a lot more robust in the latest
> code - all connections are wrapped and specifically protect
> themselves against multiple closures. Perhaps you'd like to try out
> the latest, unofficial beta version?

It doesn't seem like a good idea to try out the unofficial beta on a server already in production and used by thousands of simultaneous users.

Isn't there a hint on how I could fix the problem with the Prototyper's connectionCount? Is the protection from calling reallyClose() twice the correct approach?

I cannot find the source of 0.8.9b6. The sources at http://proxool.sourceforge.net/download/proxool-cvs-20050504.tgz don't seem to do things much differently from 0.8.3 -- but I could be missing something.
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Bill H. <arc...@gm...> - 2005-09-30 11:41:02
Hi Hrvoje,

On 9/30/05, Hrvoje Niksic <hn...@xe...> wrote:
> It doesn't seem like a good idea to try out the unofficial beta on a
> server already in production and used by thousands of simultaneous
> users.

I agree. It wasn't my suggestion. 0.8.9b6 is shortly to become RC1. There are no known problems. If you are able to test it in a non-production environment that would be very helpful.

> Isn't there a hint on how I could fix the problem with the
> Prototyper's connectionCount? Is the protection from calling
> reallyClose() twice the correct approach?

I'm not sure. Does the number of extra connections created correspond with the number of reported connections that were closed automatically?

Mitch's patch does make sense. You might want to consider making it thread safe.

> I cannot find the source of 0.8.9b6.

It is /very/ similar to the latest code from CVS. (The only class that is different is AdminServlet and that has no bearing on this). That one class being different is what makes it unofficial.

- Bill
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Hrvoje N. <hn...@xe...> - 2005-09-30 12:14:19
Bill Horsman <arc...@gm...> writes:

> 0.8.9b6 is shortly to become RC1. There are no known problems. If
> you are able to test it in a non-production environment that would
> be very helpful.

As you're probably guessing, the problem is that we cannot reproduce the problem in a non-production environment. Testbeds typically don't have that kind of heavy load and the same database access conditions. Faithfully duplicating production workload in a testing environment is a notoriously difficult problem.

What we can do, however, is provide the (excerpts from) Proxool logs and the contents of AdminServlet, or any other data that can help you locate the problem. Just tell me if you need additional info.

> > Isn't there a hint on how I could fix the problem with the
> > Prototyper's connectionCount? Is the protection from calling
> > reallyClose() twice the correct approach?
>
> I'm not sure. Does the number of extra connections created
> correspond with the number of reported connections that were closed
> automatically?

It's in the same ballpark, but it's hard to prove reliably because the logs are huge and scattered over several days, and I don't have them all at the moment. I'll look into it and get back to you on that.

> Mitch's patch does make sense. You might want to consider making it
> thread safe.

I can do that. Is the additional ++connectionCount in the last hunk of his patch also in your opinion correct?

Thanks for your help and the quick response!
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Bill H. <arc...@gm...> - 2005-09-30 12:57:50
Hi Hrvoje,

On 9/30/05, Hrvoje Niksic <hn...@xe...> wrote:
> > Mitch's patch does make sense. You might want to consider making it
> > thread safe.
>
> I can do that. Is the additional ++connectionCount in the last hunk
> of his patch also in your opinion correct?

Hmm. That /is/ a bug. And it's still in the latest code :( That variable only comes into effect in turning away requests quickly, which is why it hasn't been spotted before. If it's wrong (too low) then it means that the "triage" stage will accept the connection but will refuse it when it discovers there aren't any. I think I would increment that count in a slightly different place: ConnectionPool.addProxyConnection, to be precise.

    protected void addProxyConnection(ProxyConnectionIF proxyConnection) {
        try {
            acquireConnectionStatusWriteLock();
            proxyConnections.add(proxyConnection);
            connectionCountByState[proxyConnection.getStatus()]++;
            connectionCount++; // Patch
        } finally {
            releaseConnectionStatusWriteLock();
        }
    }

However, I haven't tested that. The connectionCount isn't displayed in the logs and it's not used when deciding whether to build a new connection, so it isn't helpful to your solution. (Although, I agree it should be fixed).

Back to the problem that you are experiencing. My simple unit test didn't track it down. But I have some ideas that I'll try out over the next few days. Feel free to bug me about that.

- Bill
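The effect of a too-low count on the "triage" stage described above can be pictured with a minimal sketch. It assumes a two-stage admission check in which a cheap counter is consulted first and the real pool state second, with every open connection busy. The class `TriageSketch`, its fields, and the `Outcome` values are hypothetical names for illustration and do not reproduce Proxool's actual code.

```java
/** Hypothetical two-stage admission check; names are illustrative, not Proxool's API. */
final class TriageSketch {
    enum Outcome { REJECTED_BY_TRIAGE, REJECTED_AFTER_LOOKUP, SERVED }

    private final int maximumConnectionCount;
    private int openConnections;  // what the pool really holds (all assumed busy)
    private int connectionCount;  // cheap counter consulted by triage; may have drifted

    TriageSketch(int maximumConnectionCount, int openConnections, int connectionCount) {
        this.maximumConnectionCount = maximumConnectionCount;
        this.openConnections = openConnections;
        this.connectionCount = connectionCount;
    }

    synchronized Outcome request() {
        // Stage 1 (triage): turn the request away quickly using only the counter.
        if (connectionCount >= maximumConnectionCount) {
            return Outcome.REJECTED_BY_TRIAGE;
        }
        // Stage 2: the authoritative check against the real pool state.
        if (openConnections >= maximumConnectionCount) {
            return Outcome.REJECTED_AFTER_LOOKUP;
        }
        // There is room, so a new connection is opened and both numbers move together.
        openConnections++;
        connectionCount++;
        return Outcome.SERVED;
    }

    public static void main(String[] args) {
        // Accurate counter: the full pool is detected by the cheap check.
        System.out.println(new TriageSketch(2, 2, 2).request()); // prints REJECTED_BY_TRIAGE
        // Counter too low: triage lets the request through, the later check refuses it.
        System.out.println(new TriageSketch(2, 2, 1).request()); // prints REJECTED_AFTER_LOOKUP
    }
}
```

In this model a drifted counter only costs the fast rejection path. The request is still refused, just later, which is consistent with the remark that the variable does not decide whether a new connection is built.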
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Hrvoje N. <hn...@xe...> - 2005-10-02 22:49:55
Bill Horsman <arc...@gm...> writes:

> > Isn't there a hint on how I could fix the problem with the
> > Prototyper's connectionCount? Is the protection from calling
> > reallyClose() twice the correct approach?
>
> I'm not sure. Does the number of extra connections created
> correspond with the number of reported connections that were closed
> automatically?

The additional connections apparently coincide with the occurrence of this message:

    2005-09-27 01:36:57 WARN [poolname] [Thread-18]: #1660 - There were some problems resetting the connection (see debug output for details). It will not be used again (just in case). The thread that is responsible is named 'Thread-18'

It seems to occur after the HouseKeeper forcibly removes the connection. In our logs the number of these messages equals the number of additional connections in the pool. An example:

    2005-09-27 01:36:56 DEBUG [poolname] [HouseKeeper]: 1003445 -000257 (04/31/00) - #1660 removed because it has been active for too long.
    2005-09-27 01:36:56 WARN [poolname] [HouseKeeper]: #1660 was active for 144709 milliseconds and has been removed automaticaly. The Thread responsible was named 'Thread-18'.
    2005-09-27 01:36:56 DEBUG [poolname] [Prototyper]: 1003445 -000257 (04/32/00) - Connection #1665 created to achieve minimum of 2 = AVAILABLE
    2005-09-27 01:36:57 WARN [poolname] [Thread-18]: #1660 - There were some problems resetting the connection (see debug output for details). It will not be used again (just in case). The thread that is responsible is named 'Thread-18'
    2005-09-27 01:36:57 WARN [poolname] [Thread-18]: #1660 - The connection was closed with autoCommit=false. That is fine, but it might indicate that the problems that happened whilst trying to reset it were because a transaction is still in progress.
    2005-09-27 01:36:57 DEBUG [poolname] [Thread-18]: 1003445 -000257 (04/32/00) - #1660 removed because it couldn't be reset.
    2005-09-27 01:36:57 WARN [poolname] [Thread-18]: Unable to set status of connection 1660 from ACTIVEto AVAILABLE. It remains NULL

Note especially these two lines:

    2005-09-27 01:36:56 WARN [poolname] [HouseKeeper]: #1660 was active for 144709 milliseconds and has been removed automaticaly. The Thread responsible was named 'Thread-18'.
    [...]
    2005-09-27 01:36:57 DEBUG [poolname] [Thread-18]: 1003445 -000257 (04/32/00) - #1660 removed because it couldn't be reset.

The connection #1660 has been removed twice, which means it could have decremented connectionCount twice as well.

What do you think?
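Checking large logs for connections that were removed more than once can be automated. The following is a minimal sketch, assuming the DEBUG lines keep the "#<id> removed because" wording seen in the excerpts above. The class `DoubleRemovalScan` and its method are hypothetical helpers written for illustration and are not part of Proxool.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log scanner; counts "removed because" lines per connection id. */
final class DoubleRemovalScan {
    // Matches e.g. "#1660 removed because it couldn't be reset."
    private static final Pattern REMOVED = Pattern.compile("#(\\d+) removed because");

    static Map<String, Integer> removalsPerConnection(List<String> logLines) {
        Map<String, Integer> removals = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher matcher = REMOVED.matcher(line);
            if (matcher.find()) {
                // Add one to the running total for this connection id.
                removals.merge(matcher.group(1), 1, Integer::sum);
            }
        }
        return removals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2005-09-27 01:36:56 DEBUG [poolname] [HouseKeeper]: 1003445 -000257 (04/31/00) - #1660 removed because it has been active for too long.",
            "2005-09-27 01:36:56 DEBUG [poolname] [Prototyper]: 1003445 -000257 (04/32/00) - Connection #1665 created to achieve minimum of 2 = AVAILABLE",
            "2005-09-27 01:36:57 DEBUG [poolname] [Thread-18]: 1003445 -000257 (04/32/00) - #1660 removed because it couldn't be reset.");
        removalsPerConnection(lines).forEach((id, count) -> {
            if (count > 1) {
                System.out.println("#" + id + " removed " + count + " times"); // prints "#1660 removed 2 times"
            }
        });
    }
}
```

Comparing the number of ids reported this way with the number of surplus connections would give a firmer answer to the question of whether the two correspond.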
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Bill H. <arc...@gm...> - 2005-10-03 09:08:23
Hi Hrvoje,

On 10/1/05, Hrvoje Niksic <hn...@xe...> wrote:
> Note especially these two lines:
>
> 2005-09-27 01:36:56 WARN [poolname] [HouseKeeper]: #1660 was active for
> 144709 milliseconds and has been removed automaticaly. The Thread
> responsible was named 'Thread-18'.
> [...]
> 2005-09-27 01:36:57 DEBUG [poolname] [Thread-18]: 1003445 -000257
> (04/32/00) - #1660 removed because it couldn't be reset.
>
> The connection #1660 has been removed twice, which means it could have
> decremented connectionCount twice as well.
>
> What do you think?

I think you're on to something there. I had a first look at this on the train this morning and I'm starting to get my head round it. I'm trying to reproduce the problem in a unit test at the moment. It's tricky getting the connection reset to fail but I can get round that.

- Bill
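One generic way to make a connection reset fail in a unit test is to hand the pool a JDBC `Connection` built with `java.lang.reflect.Proxy` that throws `SQLException` from selected methods. The sketch below is an assumption about how such a test double could look, not the test that was actually written. The class `FailingResetConnection` is hypothetical, and which methods the reset step calls (here `rollback` is used as an example) is also an assumption.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical test double: a Connection whose chosen methods always fail. */
final class FailingResetConnection {
    private FailingResetConnection() {
    }

    static Connection create(String... failingMethodNames) {
        Set<String> failing = new HashSet<>(Arrays.asList(failingMethodNames));
        InvocationHandler handler = (proxy, method, args) -> {
            if (failing.contains(method.getName())) {
                // Connection methods declare SQLException, so it reaches the caller unwrapped.
                throw new SQLException("Simulated failure in " + method.getName());
            }
            // Harmless defaults for everything else.
            Class<?> type = method.getReturnType();
            if (type == boolean.class) {
                return false;
            }
            if (type == int.class) {
                return 0;
            }
            return null; // void methods and methods returning objects
        };
        return (Connection) Proxy.newProxyInstance(
            FailingResetConnection.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            handler);
    }

    public static void main(String[] args) {
        Connection connection = create("rollback");
        try {
            connection.rollback();
        } catch (SQLException e) {
            System.out.println(e.getMessage()); // prints "Simulated failure in rollback"
        }
    }
}
```

A test could register a driver or delegate that returns such a connection, let the housekeeper expire it, and then close it from the client thread to see whether it is removed twice.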
Re: [Proxool-developer] Spurious connections caused by Prototyper's apparently wrong connectionCount
From: Hrvoje N. <hn...@xe...> - 2005-10-04 11:15:10
Bill Horsman <arc...@gm...> writes:

> > 2005-09-27 01:36:56 WARN [poolname] [HouseKeeper]: #1660 was active
> > for 144709 milliseconds and has been removed automaticaly. The Thread
> > responsible was named 'Thread-18'.
> > [...]
> > 2005-09-27 01:36:57 DEBUG [poolname] [Thread-18]: 1003445 -000257
> > (04/32/00) - #1660 removed because it couldn't be reset.
> >
> > The connection #1660 has been removed twice, which means it could have
> > decremented connectionCount twice as well.
> >
> > What do you think?
>
> I think you're on to something there. I had a first look at this on
> the train this morning and I'm starting to get my head round it. I'm
> trying to reproduce the problem in a unit test at the moment. It's
> tricky getting the connection reset to fail but I can get round
> that.

Thanks for looking into it. Please let me know if you need more information.