Re: [Simpleweb-Support] Scaling Problems with SimpleWeb
From: Niall G. <gal...@ya...> - 2005-12-15 14:37:51
Hi Kristian,

Firstly, are you using Java 1.5 or 1.4? Also, what is your memory
profile like under high load? An OutOfMemoryError can cause services
to end without closing the connection. An OutOfMemoryError can also,
on occasion, cause confusion within many data structures in both the
Simple and Java class libraries.

Have you used JConsole (with the -Dcom.sun.management.jmxremote VM
argument) to inspect the state of the VM? What is the memory usage
like, and are there any dead threads (there should be none)?

If you are sure it is not an OutOfMemoryError, then I will certainly
investigate it further.

Thanks for the feedback, I really appreciate it.

Niall

--- Kristian Reesen Skouboe <kri...@ja...> wrote:

> Hello,
>
> I have done some scaling tests of SimpleWeb 2.7.4 using a Microsoft
> Web Application Stress Tool.
>
> The OS is Linux 2.4.21-32.ELsmp #1 SMP Fri Apr 15 21:17:59 EDT 2005
> i686 i686 i386 GNU/Linux. 2 CPUs - 3 GHz and 2 GB of RAM. A normal
> HP server. The tests have also been run on two other hardware
> architectures.
>
> The number of connections to SimpleWeb has been set to 4000
> simultaneous connections during a test period of 20 minutes.
>
> The connections are made to different images that are served by
> SimpleWeb through an implementation of the Process method. The
> Process method reads the images from a HashMap using the URL given.
>
> Furthermore, I have put in some monitoring of the number of sockets
> that are opened and closed while SimpleWeb is running.
>
> Both the GranularPoller and the DefaultPoller have been used, with
> both standard and lower timeout settings.
>
> Furthermore, we use the LinuxConfigurator, and we have also used the
> default configuration. Here is a snapshot of the LinuxConfigurator:
>
> xxx
>
> The tests have been run multiple times.
> The results of the tests are that SimpleWeb halts its execution and
> stops serving images to the Web clients that connect (the Stress
> Tool). This happens once about 500 simultaneous connections have
> been made to SimpleWeb. Furthermore, the number of connections then
> rises to about 2000, where it stops increasing (the maximum is 4000
> connections in the Stress Tool). Then, when the stress tool is done,
> the number of connections sometimes decreases to about 500, other
> times to about 10-20. But it never decreases to 0 connections.
>
> When the test is run again, the number of connections rises again,
> but no requests are served by SimpleWeb. It only enqueues the
> requests; it never dequeues them and gives a response back.
>
> If the tests are done with fewer connections, e.g. 1000 connections
> for about 3-4 minutes, SimpleWeb functions fine and serves requests
> as designed. It never halts. The halt only happens when the number
> of connections is high (4000 connections).
>
> After doing some analysis of the Java classes, I think I have found
> four classes that may be having the problems:
>
> - Processor
> - Scheduler
> - SchedulerQueue
> - PriorityQueue
>
> The Processor class opens the sockets and lets the pipeline method
> execute on the sockets. However, the sockets are never really closed
> (except when the GranularPoller closes a pipeline, in which case a
> socket closes as well), and a lot of the sockets convert to "Can't
> Identify protocols" in Linux. When the number of sockets reaches the
> maximum number of open file handles in Linux, SimpleWeb cannot serve
> any more web content (images in this case). The GranularPoller looks
> like it is working when the number of simultaneous connections is
> lower than 500.
> However, when the number of connections increases above this to
> about 1000-2000, the GranularPoller stops closing down connections.
>
> The Scheduler class is used for enqueuing and dequeuing objects in a
> queue. However, when SimpleWeb is running under high load, the
> enqueuing continues to work, but the dequeuing of objects stops
> working. I have put logging into the class, and dequeuing never
> happens again once SimpleWeb has been overloaded with requests.
>
> The SchedulerQueue class works together with the Scheduler class;
> the dequeuing of objects never happens here either.
>
> The PriorityQueue continues to grow with new requests from the
> Stress Tool; however, it does not shrink when SimpleWeb is
> overloaded.
>
> Two solutions that currently work:
>
> I have tried out two solutions that tend to solve the problem.
> However, the solutions are not good, because they kick users away
> from the web server.
>
> Solution 1: Implementing a monitoring thread that monitors the
> number of open connections. This is the same as implementing a "Max
> connections allowed" setting in SimpleWeb, as many other web servers
> have. When the number of connections is too high, we stop serving.
> This tends to work.
>
> Solution 2: Try to notify the Daemons, PriorityQueue and other
> running threads that they need to work. This is done using a
> "notifyAll" call to the different threads. This tends to kick in the
> dequeuing of the PriorityQueue; however, it does not work
> completely.
>
> Please help me with deep design details of SimpleWeb's different
> queue structures and how it handles many connections (unlimited?),
> sockets and thread locking. There seem to be some problems when
> running SimpleWeb under high load.
>
> Med venlig hilsen / Best Regards,
> Kristian Reesen Skouboe
> Denmark, Europe
>
=== message truncated ===

Niall Gallagher
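
The JConsole check suggested in the reply above can also be approximated from inside the server process, which may be handy for logging a snapshot at intervals during a stress run. The following is only a rough sketch using the standard java.lang.management API that ships with Java 1.5; the class name VmSnapshot and its method names are hypothetical and are not part of Simple. It also assumes that "dead threads" can be read as threads deadlocked on monitors, which may not be exactly what was meant.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Simple: reads the same kind of
// figures that JConsole shows on its memory and threads views.
public class VmSnapshot {

    /** Bytes of heap currently in use, as reported by the memory MXBean. */
    public static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Maximum heap size in bytes, or -1 if the VM does not define one. */
    public static long heapMaxBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
    }

    /** Number of live threads, daemon and non-daemon. */
    public static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Number of threads deadlocked on object monitors (0 if none). */
    public static int deadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findMonitorDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    /** One-line summary suitable for a log file. */
    public static String describe() {
        return "heapUsed=" + heapUsedBytes()
             + " heapMax=" + heapMaxBytes()
             + " liveThreads=" + liveThreads()
             + " deadlocked=" + deadlockedThreads();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Note that findMonitorDeadlockedThreads only reports threads blocked in a cycle of monitor locks. A thread parked in wait() that is simply never notified would not show up here, so a stalled dequeue could still go undetected by this check; a thread dump would be the more telling tool in that case.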