From: Till K. <kin...@gm...> - 2013-08-15 15:25:20
On 15.08.2013 15:18, Osullivan L. wrote:
> Hi Folks,
>
> Whilst we were using VuFind 1x we had lots of problems with garbage
> collection, with the heap basically getting full far too quickly, thus
> necessitating frequent restarts of VuFind.

That shouldn't be necessary if you manage to find a suitable heap size and garbage collection settings. We have Solr instances running for weeks...

> Having read about Solr 4's better memory management, I had hoped to see
> significant improvements in this problem but unfortunately, that does
> not seem to be the case.

We are just in the process of turning our old "Solr 3.6 cluster" into a Solr 4.4 based SolrCloud environment... The garbage collection settings that eventually worked fine with Solr 3.6 don't work well with Solr 4 for us (meaning overly long "stop the world" collections every few hours or days, depending on settings, that we didn't have with Solr 3.6). But since we are also changing the complete architecture of our Solr environment (from one large index replicated to several machines, to an index split into 5 shards, each replicated 3 times), that might not be a general rule.

> Ubuntu 12.04 on Virtual Server
> 10GB Ram
>
> JAVA_OPTIONS="-server -d64 -Xms5120m -Xmx5120m -XX:+UseParallelGC
> -XX:+UseParallelOldGC -XX:+AggressiveOpts -XX:NewRatio=5
> -Xloggc:/var/log/vufind2/gc.log"

This article gives some simple advice on how to find a reasonable heap size: https://support.lucidworks.com/entries/25063063-Estimating-your-heap-and-memory-requirements

I'd recommend trying -XX:+UseConcMarkSweepGC together with -XX:+UseParNewGC as garbage collectors. With Solr 3.6 that works nicely for us. There are lots of options you can use to fine-tune the behaviour of UseConcMarkSweepGC.
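Put together, a CMS-based variant of the JAVA_OPTIONS above might look like this (a sketch only: heap sizes and log path are copied from the original post, and the two CMS occupancy flags are just one example of the fine-tuning options mentioned, not a recommendation):

```shell
# Hypothetical JAVA_OPTIONS sketch: CMS (concurrent mark-sweep) for the old
# generation plus the parallel young-generation collector, replacing the
# ParallelGC flags from the original post. The occupancy settings tell CMS
# to start collecting at 75% old-gen usage instead of guessing; tune the
# value (and the heap size) against your own gc.log.
JAVA_OPTIONS="-server -d64 -Xms5120m -Xmx5120m \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly \
  -Xloggc:/var/log/vufind2/gc.log"
```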
I played with lots of them, and different values for -XX:NewRatio, for example, showed strong effects (8 or 9 worked quite well for us), but at some point we still ran into "stop the world" collections that tore down the whole SolrCloud in a chain reaction and killed search for several minutes (because the whole recovery scenario in the cloud kicks in once Solr responds again). So I have no final conclusion on good settings...

But if you are using a recent Java 7 distribution, you can try just -XX:+UseG1GC as the single garbage collection option. That's the more or less new G1 collector, which according to Oracle is ideal for high-performance applications with large heaps and low pause-time requirements (like Solr)... There are differing reports about the true abilities of this thing (especially when used with Solr). For us it currently works at least better than UseConcMarkSweepGC with all kinds of esoteric settings (but maybe I just never found the right combination of settings?)... It's still not ideal, but it seems to work now...

> We are currently operating at nowhere near peak as most students are
> off but the garbage is filling up every 1 hour and 40 mins or so.

So what is happening then? Out-of-memory exceptions, or just painful "stop the world" collections that interrupt Solr?

General rule: If Solr goes OOM (out of memory), either garbage collection kicks in too late (here all the options of UseConcMarkSweepGC can help) or you simply don't have enough heap space (allocate more; if that's not possible, go and buy RAM, or try to lower Solr's memory requirements by reducing cache sizes, though that will also cost performance). If there are long "stop the world" pauses of several seconds, try to reduce(!) the heap size or use a different garbage collector...

Till
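For completeness, the G1 alternative described above really does boil down to a single collector flag (a sketch; heap sizes and log path again copied from the original post):

```shell
# Hypothetical JAVA_OPTIONS sketch: G1 as the sole collector, for a recent
# Java 7 JVM. No further GC flags needed to start with; G1 tunes itself
# around pause-time goals, so begin plain and only add options if gc.log
# shows problems.
JAVA_OPTIONS="-server -d64 -Xms5120m -Xmx5120m \
  -XX:+UseG1GC \
  -Xloggc:/var/log/vufind2/gc.log"
```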