[Simpleweb-Support] Scaling Problems with SimpleWeb
From: Kristian R. S. <kri...@ja...> - 2005-12-15 14:11:50
Hello,

I have done some scaling tests of SimpleWeb 2.7.4 using the Microsoft Web Application Stress Tool.

The OS is Linux 2.4.21-32.ELsmp #1 SMP Fri Apr 15 21:17:59 EDT 2005 i686 i686 i386 GNU/Linux, running on a normal HP server with 2 CPUs at 3 GHz and 2 GB of RAM. The tests have also been run on two other hardware architectures.

The Stress Tool was set to 4000 simultaneous connections to SimpleWeb over a test period of 20 minutes.

The connections request different images, which are served by SimpleWeb through an implementation of the Process method. The Process method reads the images from a HashMap using the URL given.

Furthermore, I have added some monitoring of the number of sockets that are opened and closed while SimpleWeb is running.

Both the GranularPoller and the DefaultPoller have been used, with both the standard and lower timeout settings. We have also used both the LinuxConfigurator and the default configuration. Here is a snapshot of the LinuxConfigurator: xxx

The tests have been run multiple times. The result is that SimpleWeb halts its execution and stops serving images to the Web clients that connect (the Stress Tool). This happens once about 500 simultaneous connections have been made to SimpleWeb. The number of connections then rises to about 2000, where it stops increasing (the maximum is 4000 connections in the Stress Tool). When the Stress Tool is done, the number of connections sometimes decreases to about 500, other times to about 10-20, but it never decreases to 0.

When the test is run again, the number of connections rises again, but no requests are served by SimpleWeb. It only enqueues the requests; it never dequeues them and sends a response back.

If the tests are run with fewer connections, e.g.
1000 connections for about 3-4 minutes, SimpleWeb works fine and serves requests as designed. It never halts. The halt only happens when the number of connections is high (4000).

After doing some analysis of the Java classes, I think I have found four classes that may be causing the problems:

- Processor
- Scheduler
- SchedulerQueue
- PriorityQueue

The Processor class opens the sockets and lets the pipeline method execute on them. However, the sockets are never really closed (except when the GranularPoller closes a pipeline, which also closes its socket), and many of the sockets turn into "Can't Identify protocols" in Linux. When the number of sockets reaches the maximum number of open file handles in Linux, SimpleWeb cannot serve any more web content (images in this case). The GranularPoller appears to work when the number of simultaneous connections is below 500. However, when the number of connections increases to about 1000-2000, the GranularPoller stops closing down connections.

The Scheduler class is used for enqueuing and dequeuing objects in a queue. When SimpleWeb is running under high load, the enqueuing continues to work, but the dequeuing of objects stops. I have added logging to the class, and dequeuing never happens again once SimpleWeb has been overloaded with requests.

The SchedulerQueue class works together with the Scheduler class, and the dequeuing of objects never happens here either.

The PriorityQueue continues to grow with new requests from the Stress Tool, but it does not shrink once SimpleWeb is overloaded.

Two solutions that currently work:

I have tried out two solutions that tend to solve the problem. However, the solutions are not good, because they kick users off the web server.

Solution 1: Implementing a monitoring thread that monitors the number of connections open.
This is the same as implementing a "Max connections allowed" setting in SimpleWeb, as many other web servers have. When the number of connections gets too high, we stop serving. This tends to work.

Solution 2: Try to notify the Daemons, the PriorityQueue and the other running threads that they need to work. This is done using a "notifyAll" call to the different threads. This tends to restart the dequeuing of the PriorityQueue, but it does not work completely.

Please help me with deep design details of SimpleWeb's different queue structures and how it handles many connections (unlimited?), sockets and thread locking. There seem to be some problems when running SimpleWeb under high load.

Med venlig hilsen / Best Regards,
Kristian Reesen Skouboe
Denmark, Europe
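
For reference, the "Max connections allowed" idea from Solution 1 could be sketched roughly as below. This is only an illustration under assumptions: the class `ConnectionLimiter` and its methods are hypothetical and not part of SimpleWeb, it uses a `java.util.concurrent.Semaphore` (Java 5 or later) instead of a separate monitoring thread, and it assumes the code that accepts and closes sockets can be made to call `tryAcquire()` and `release()`.

```java
import java.util.concurrent.Semaphore;

/**
 * Hypothetical helper, not part of SimpleWeb: caps the number of
 * connections that may be open at the same time.
 */
final class ConnectionLimiter {

    private final int maxConnections;
    private final Semaphore permits;

    ConnectionLimiter(int maxConnections) {
        this.maxConnections = maxConnections;
        this.permits = new Semaphore(maxConnections);
    }

    /** Returns true if the new connection may be served, false if the cap is reached. */
    boolean tryAcquire() {
        return permits.tryAcquire();
    }

    /** Call exactly once for every successful tryAcquire(), when the socket is closed. */
    void release() {
        permits.release();
    }

    /** Number of connections currently counted as open. */
    int openConnections() {
        return maxConnections - permits.availablePermits();
    }

    public static void main(String[] args) {
        ConnectionLimiter limiter = new ConnectionLimiter(2);
        // Try to accept three connections against a cap of two.
        for (int i = 1; i <= 3; i++) {
            boolean accepted = limiter.tryAcquire();
            System.out.println("connection " + i + " accepted: " + accepted);
        }
        // Closing one connection frees a slot for the next client.
        limiter.release();
        System.out.println("open connections: " + limiter.openConnections());
    }
}
```

With such a cap set below the open file handle limit, a rejected client would have its socket closed immediately instead of being enqueued. This keeps the server responsive but still turns users away, which is the drawback described above.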