From: Mattison W. <mat...@ne...> - 2012-06-19 20:03:58
Sorry for the repeat message to some folks, but I wanted to send this to the whole list.

Occasionally the Tomcat process on the production or development site will go through the roof for CPU usage and stop responding until it is killed and restarted. I have searched through the error logs but haven't found anything obvious to me. I am starting to suspect that trying to download certain studies in NeXML format precipitates the problem. If several of these NeXML requests start to pile up, memory usage by the JVM keeps increasing and the server stops responding.

You can see this yourself if you try to download the NeXML file from
http://treebase-dev.nescent.org/treebase-web/search/study/summary.html?id=12742

The NEXUS file for that study downloads almost immediately. The screenshot shows one NeXML request that I left running for over 10 minutes. It may be that these requests would complete if I left them running long enough, but in practice they can overwhelm the server when they are not issued serially (as they were in my test).

Other "problem" study ids between 12800 and 11753, with the NeXML download time first and the NEXUS time in parentheses:

12201: killed after 10 minutes (NEXUS download also killed after 10 minutes, but the database was busy rather than Java/Tomcat)
12156: killed after 10 minutes (5 seconds for NEXUS)
12064: 5 minutes (3 seconds for NEXUS)
12032: 4 minutes (1 second for NEXUS)
11872: 10 minutes (52 seconds for NEXUS)
11811: killed after 10 minutes (35 seconds for NEXUS)

Note that I turned Apache caching off for these tests and will leave it off so that anyone else can test.

Could timeouts for uploading and downloading files be implemented so that too many concurrent downloads don't disable the site? Maybe some studies are too large to be converted to NeXML in a workable amount of time? Is TreeBASE written in such a way that we could deploy many Tomcat instances pointed at a common database, so that if one client requested many downloads it would only tie up one Tomcat instance while the other 7 kept serving requests happily?

-- Mattison
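
To make the timeout question above concrete, here is a rough sketch of the kind of guard being asked about: a cap on how many slow exports run at once, plus a time budget for each conversion. It is written in Python purely for illustration (TreeBASE itself runs on Java/Tomcat), and every name in it (ConversionGate, Busy, convert_with_deadline) is hypothetical and does not come from the TreeBASE code base. The "conversion" is a stand-in that just upper-cases text.

```python
import threading
import time


class Busy(Exception):
    """Raised when every export slot is already taken (hypothetical)."""


class ConversionGate:
    """Caps how many slow exports may run at the same time (hypothetical)."""

    def __init__(self, max_concurrent, wait_seconds):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._wait_seconds = wait_seconds

    def run(self, convert, *args):
        # Wait briefly for a free slot, then give up instead of letting
        # requests pile up behind the ones that are already running.
        if not self._slots.acquire(timeout=self._wait_seconds):
            raise Busy("too many exports in progress")
        try:
            return convert(*args)
        finally:
            self._slots.release()


def convert_with_deadline(chunks, deadline_seconds):
    """Stand-in for a slow conversion that checks its time budget per chunk."""
    started = time.monotonic()
    out = []
    for chunk in chunks:
        # Cooperative timeout: the work itself notices that it is over budget.
        if time.monotonic() - started > deadline_seconds:
            raise TimeoutError("conversion exceeded its time budget")
        out.append(chunk.upper())  # placeholder for the real per-chunk work
    return "".join(out)
```

A caller would wrap each export as gate.run(convert_with_deadline, chunks, 60.0) and turn Busy or TimeoutError into a "try again later" response. The point of the design is that a client requesting many downloads gets rejected quickly rather than tying up every worker; whether something equivalent fits into the TreeBASE request handling is an open question.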