From: Leif M. <le...@ta...> - 2009-05-18 10:32:09
Santo,
We are in the process of setting up a Solaris 10 server to do some testing with Zones in house. We have a Solaris 9 server, but our Solaris 10 testing has been done IN a zone on Sun's EZqual loaner server. I will let you know what we find.

As I explained, the Wrapper never actually attempts to allocate all 1000 ports unless they are already blocked. If the first instance of your application uses ports 32000 and 31000 and then crashes, it is possible that the 32000 port will be locked for 2 minutes, so the second invocation of the JVM would use 32001 and 31000. But the other 999 ports would never have been accessed, so I can imagine no reason why they would be locked.

In your case with Solaris Zones, you say that the Wrapper cannot start in the second Zone while one is running in the first. Are you able to verify that the Wrapper in the second Zone works if the first has not been running for at least 2 minutes? I am wondering whether this is a configuration issue. We will be able to test this shortly ourselves, and it doesn't sound like you will be able to test it until your system is back up and running.

Sorry for this next question, as it may show my lack of knowledge of Solaris Zones: with your system, are both Zones sharing the same IP address? If so, they would not be able to share ports on that IP. In this case, however, we are only binding to localhost, so it should not matter.

I will post back as soon as we have gotten this tested out.
Cheers,
Leif

On Mon, May 18, 2009 at 5:53 PM, Santo74 <gds...@de...> wrote:
>
> Hi Leif,
>
> thanks for the quick answer.
>
> Actually the issue was reported in our Mantis system by several of our consultants, and what I understood from their reports was that the port range was reserved on all the different platforms our product runs on. But apparently there is no real evidence of that, so it isn't necessarily true. The reason it was assumed that the whole range was reserved / allocated is that they were never able to restart our application with the same port range after a crash or forced stop. This happened multiple times (for several different reasons, but that's not important here), and each time they were forced to use a port range that didn't overlap with the default range (i.e. 31000-32000).
>
> On Solaris, on the other hand (and that's a situation I tested and verified myself), it's very clear that a second instance of our application can't start with the default port range in any other zone on that same server.
>
> I'm not sure what the result of netstat was on such a Solaris zone, and unfortunately I can't check it at the moment because we are having issues with the raid controller of our Solaris system which prevent it from booting :-(
>
> As soon as we get it up and running again, if it would be of any help to you, I would be glad to run netstat on one of the zones and post the output here.
>
> regards,
>
> gds
>
>
> Leif Mortenson-3 wrote:
>>
>> Santo,
>> Sorry for this trouble with the Java Service Wrapper. We have designed it to work automatically when more than one copy of the Wrapper is run on the same machine. It does not intentionally allocate all 1000 ports. Rather, it starts by attempting to allocate the first port, then moves on to the next if that first one is already allocated. Once it finds an open port it should never even attempt to access the rest of the range.
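>>
>> In rough terms, the allocation loop amounts to something like this. The sketch below is in Java purely for illustration (it is not the Wrapper's actual source, and the class and method names are made up); the 31000-32000 range and the localhost-only binding are the defaults discussed in this thread:
>> ---
>> import java.io.IOException;
>> import java.net.InetAddress;
>> import java.net.ServerSocket;
>>
>> public class PortScanSketch {
>>     // Try each port in turn; the first successful bind wins and the
>>     // rest of the range is never touched.
>>     static ServerSocket bindFirstFreePort(int minPort, int maxPort)
>>             throws IOException {
>>         InetAddress localhost = InetAddress.getByName("127.0.0.1");
>>         for (int port = minPort; port <= maxPort; port++) {
>>             try {
>>                 return new ServerSocket(port, 50, localhost);
>>             } catch (IOException e) {
>>                 // Port in use (or still in TIME_WAIT); try the next one.
>>             }
>>         }
>>         throw new IOException("No free port in " + minPort + "-" + maxPort);
>>     }
>>
>>     public static void main(String[] args) throws IOException {
>>         ServerSocket s = bindFirstFreePort(31000, 32000);
>>         System.out.println("Bound to port " + s.getLocalPort());
>>         s.close();
>>     }
>> }
>> ---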
>>
>> How exactly are you determining that all 1000 ports are being reserved / allocated? When I run netstat on our test system with one copy of the Wrapper running, I get this:
>> ---
>> # netstat
>>
>> TCP: IPv4
>>    Local Address        Remote Address      Swind Send-Q Rwind Recv-Q     State
>> -------------------- -------------------- ----- ------ ----- ------ -----------
>> solx86.ssh           10.24.115.41.63503   49640     47 49640      0 ESTABLISHED
>> localhost.63651      localhost.63650      49152      0 49152      0 TIME_WAIT
>> localhost.31000      localhost.32000      49152      0 49152      0 ESTABLISHED
>> localhost.32000      localhost.31000      49152      0 49170      0 ESTABLISHED
>>
>> Active UNIX domain sockets
>> Address          Type       Vnode            Conn     Local Addr          Remote Addr
>> fffffe853d5cdac0 stream-ord fffffe855a118a80 00000000 /var/run/.inetd.uds
>> ---
>>
>> Leaving the first Wrapper up, I ran a couple of other tests and then started a second Wrapper. In that state, here is what I get from netstat:
>> ---
>> # netstat
>>
>> TCP: IPv4
>>    Local Address        Remote Address      Swind Send-Q Rwind Recv-Q     State
>> -------------------- -------------------- ----- ------ ----- ------ -----------
>> localhost.63661      localhost.63660      49152      0 49152      0 TIME_WAIT
>> localhost.31001      localhost.32001      49170      0 49152      0 TIME_WAIT
>> localhost.63667      localhost.63666      49152      0 49152      0 TIME_WAIT
>> localhost.31002      localhost.32001      49170      0 49152      0 TIME_WAIT
>> localhost.63670      localhost.63669      49152      0 49152      0 TIME_WAIT
>> localhost.31003      localhost.32001      49152      0 49152      0 ESTABLISHED
>> localhost.32001      localhost.31003      49152      0 49170      0 ESTABLISHED
>> solx86.ssh           10.24.115.41.63503   49640     47 49640      0 ESTABLISHED
>> localhost.31000      localhost.32000      49152      0 49152      0 ESTABLISHED
>> localhost.32000      localhost.31000      49152      0 49170      0 ESTABLISHED
>>
>> Active UNIX domain sockets
>> Address          Type       Vnode            Conn     Local Addr          Remote Addr
>> fffffe853d5cdac0 stream-ord fffffe855a118a80 00000000 /var/run/.inetd.uds
>> ---
>>
>> The first Wrapper is using the socket 31000 -> 32000 (JVM to Wrapper) and the reverse of that socket. The second Wrapper is using the socket 31003 -> 32001 (JVM to Wrapper) and the reverse of that socket. Neither of those is a range; each Wrapper uses the two halves of a single socket, as expected. The TIME_WAIT entries are from the other tests I mentioned and will remain in that state for 2 minutes, until the system decides that no more data can come in and closes the ports. netstat then reports this:
>> ---
>> # netstat
>>
>> TCP: IPv4
>>    Local Address        Remote Address      Swind Send-Q Rwind Recv-Q     State
>> -------------------- -------------------- ----- ------ ----- ------ -----------
>> localhost.31003      localhost.32001      49152      0 49152      0 ESTABLISHED
>> localhost.32001      localhost.31003      49152      0 49170      0 ESTABLISHED
>> solx86.ssh           10.24.115.41.63503   49640     47 49640      0 ESTABLISHED
>> localhost.31000      localhost.32000      49152      0 49152      0 ESTABLISHED
>> localhost.32000      localhost.31000      49152      0 49170      0 ESTABLISHED
>>
>> Active UNIX domain sockets
>> Address          Type       Vnode            Conn     Local Addr          Remote Addr
>> fffffe853d5cdac0 stream-ord fffffe855a118a80 00000000 /var/run/.inetd.uds
>> ---
>>
>> This all worked as I expected, but it is also all within a single Zone.
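>>
>> If you want to reproduce that 2-minute locking outside the Wrapper, a small standalone Java sketch like the one below should do it (the class name and ports are just for illustration; on most systems a quick second run fails to bind 31000 with a BindException until the TIME_WAIT on the first connection expires):
>> ---
>> import java.net.InetSocketAddress;
>> import java.net.ServerSocket;
>> import java.net.Socket;
>>
>> public class TimeWaitDemo {
>>     public static void main(String[] args) throws Exception {
>>         // The listener plays the Wrapper's role on port 32000.
>>         try (ServerSocket wrapperSide = new ServerSocket()) {
>>             wrapperSide.bind(new InetSocketAddress("127.0.0.1", 32000));
>>             // The client plays the JVM's role on fixed local port 31000.
>>             Socket jvmSide = new Socket();
>>             jvmSide.bind(new InetSocketAddress("127.0.0.1", 31000));
>>             jvmSide.connect(wrapperSide.getLocalSocketAddress());
>>             Socket accepted = wrapperSide.accept();
>>             // Closing the 31000 side first leaves 31000 -> 32000 in
>>             // TIME_WAIT, which netstat still shows after the process exits.
>>             jvmSide.close();
>>             accepted.close();
>>         }
>>     }
>> }
>> ---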
>>
>> When I stop the two Wrappers and immediately run netstat again, I see:
>> ---
>> # netstat
>>
>> TCP: IPv4
>>    Local Address        Remote Address      Swind Send-Q Rwind Recv-Q     State
>> -------------------- -------------------- ----- ------ ----- ------ -----------
>> localhost.31003      localhost.32001      49170      0 49152      0 TIME_WAIT
>> solx86.ssh           10.24.115.41.63503   49640     47 49640      0 ESTABLISHED
>> localhost.31000      localhost.32000      49170      0 49152      0 TIME_WAIT
>>
>> Active UNIX domain sockets
>> Address          Type       Vnode            Conn     Local Addr          Remote Addr
>> fffffe853d5cdac0 stream-ord fffffe855a118a80 00000000 /var/run/.inetd.uds
>> ---
>>
>> So the JVM to Wrapper half of the socket remains in a locked state for the TIME_WAIT period. This is expected because the JVM was the connecting process and the Wrapper was the listener.
>>
>> I admit that I have little experience with Solaris Zones, but I will definitely look into this further. Any additional information you could provide would be helpful in getting this resolved.
>>
>> Cheers,
>> Leif
>>
>> On Sat, May 16, 2009 at 4:10 AM, Santo74 <gds...@de...> wrote:
>>>
>>> First of all I want to apologise for waking up an old thread, but the problem described in this thread still applies. I say this because I am experiencing exactly the same behaviour as ahmadk72 with our own Java application, which uses the Java Service Wrapper.
>>>
>>> These are the remarkable things that I experienced:
>>>
>>> 1) The service wrapper is reserving all ports between 31000 and 32000 by default -> I would expect it to look for the first free port in this range and allocate / reserve that one, but apparently it reserves the whole range!!
>>>
>>> 2) On most systems it's not a huge problem, in that this port range is probably available (most of the time) and therefore doesn't cause much trouble. However, a lot of companies don't like it that a single application is occupying a range of 1000 ports.
>>>
>>> 3) On Solaris 10 zones it's even worse, because the port range appears to be in use on ALL zones as soon as ONE zone is running a service wrapper based application (cf. the explanation of ahmadk72 below).
>>>
>>> The strangest part is that we can run multiple instances of other applications that use a particular port on multiple zones without any trouble at all. E.g. a Solaris host with 5 zones, all 5 running an IBM Tivoli Policy Server instance using the same ports on all those zones, is not a problem. So why can other applications allocate a particular port for their particular zone (i.e. independently), while the service wrapper is allocating (reserving) its ports on all zones at once (i.e. globally)??
>>>
>>> 4) I know it's possible to make the port range smaller AND to let each server or zone use a different port range, but this makes it very hard to package an application for deployment.
>>>
>>> We are using Java Service Wrapper v3.2.3. Can someone please look into this and take the necessary actions where required?
>>>
>>> Thanks in advance,
>>>
>>> gds
>>>
>>>
>>> ahmadk72 wrote:
>>>>
>>>> I have a problem that has perplexed me to no end. Even our UNIX admin is stumped, so this mailing list is a last resort.
>>>>
>>>> We have a third-party application that uses the Java Service Wrapper (v3.2.1). The application is installed on two different Zones in Solaris 10.
>>>> One zone is called rhdam-dev and the other is called rhdam-tst.
>>>>
>>>> If I start up the rhdam-dev (DEV box) application, the Java service wrapper uses ports 31000 and 32000 for communicating with and listening to the JVM. Everything works fine, and I have no problems running the application as expected.
>>>>
>>>> Then recently we installed the application on the rhdam-tst (TEST box) on another Solaris zone. When the wrapper starts up, I see the following error (logging output is set to DEBUG level) in the attached log file.
>>>>
>>>> From what I have been able to decipher, it is complaining about port 32000 already being in use. However, the port is not in use on the rhdam-tst zone. I can run a netstat -an command and see that there is nothing running on port 32000.
>>>>
>>>> The really weird part is that if I stop the wrapper service on the rhdam-dev zone and try to start up the rhdam-tst instance, then the wrapper service starts successfully.
>>>>
>>>> For now I have been able to start both up by adding the wrapper.port.min and wrapper.port.max settings so the ports don't conflict between rhdam-dev and rhdam-tst (an example fragment follows at the end of this message).
>>>>
>>>> However, I need to know why this is happening. Has anyone seen this behavior? The ports are supposed to be independent per Solaris zone. Why is the Java Service Wrapper socket complaining about the port being used in another zone?
>>>>
>>>> If anyone can help me solve this mystery that would be just great.
>>>>
>>>> Thanks
>>>>
>>>> Kashif
>>>> http://www.nabble.com/file/p12904287/artesia-service-wrapper.log artesia-service-wrapper.log
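>>>>
>>>> The fragment looks something like this (the property names are the two mentioned above; the values are just an example of a range that doesn't overlap with the DEV box's 31000-32000):
>>>> ---
>>>> # wrapper.conf on the rhdam-tst zone: keep the Wrapper's port range
>>>> # clear of the range used on rhdam-dev (31000-32000 by default).
>>>> wrapper.port.min=33000
>>>> wrapper.port.max=34000
>>>> ---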