Re: [Rabbit-proxy-users] RabbIT - frequent freezes
From: Costa T. <kt...@ho...> - 2005-01-11 23:02:21
> Get the 2.0.36 final release from the normal web pages
> http://www.khelekore.org/rabbit/

Thanks! In the meantime I managed to find 2.0.36-pre3, which I installed.
Below are my results.

> Ok, could still be the cache.
> Since you do not want any cache in rabbit you could try to make sure that
> the cache index is not flushed to disk. You can do this by commenting out
> the entire writeCacheIndex in the file NCache.java.

The file cache.index is only 159 bytes long, its last modification date only
changes once per hour, and the whole cache directory seems too small (a few
MB). I guess this means that the freezes do not come from the cache. Anyway,
if you really want me to test it, I'll be glad to do it.

> Ok, that looks like a memory leak is going on. I will have to check more.
>
> One thing that you can also do if you still have the console where you
> started java open is to press Control-Backslash (C-\). That will give you
> a full thread dump of the java vm. That could help, but it will probably
> be a long listing, mail it to me instead of the list. This can
> occasionally be highly useful...

Ok. I'll do that (see also the note after the observations below about
getting the dump without the console).

In the meantime I had to solve my problem. Since I have a lot of RAM I
created about 40 RabbIT instances and configured the front-end proxy servers
to load balance across them. Normally, a single RabbIT ran for just 1-2
minutes before it started freezing. The 40 load-balanced RabbITs have now
been running for several hours without any problems.

Here are a few useful findings. The table below shows a summary of the memory
and disk usage of each RabbIT instance. Current Memory and Memory Pool are
the last two numbers reported on the most recent -verbose:gc line (the heap
in use after the collection, and the total heap size).

RabbIT Port   Current Memory   Memory Pool   Cache Dir
3125           9093K           20576K        1.9M
3126           7099K           41204K        2.0M
3127          10048K           20812K        1.9M
3128           8660K           20920K        2.1M
3129          12199K           23776K        1.9M
3130           6299K            7996K        2.0M
3131           7917K           22624K        2.0M
3132           5367K            8444K        2.1M
3133           5528K            8272K        2.1M
3134           9090K           10036K        2.3M
3135          40180K           42704K        1.9M
3136           7346K           40932K        2.0M
3137           6881K           22368K        2.0M
3138           6373K            7856K        1.9M
3139           8185K           22228K        1.8M
3140           7160K            8008K        1.9M
3141           6380K            8444K        2.2M
3142           8018K           22296K        1.9M
3143          12593K           23756K        2.1M
3144           6741K            7788K        2.0M
3145           7549K           20948K        1.8M
3146          20309K           22252K        1.9M
3147           4718K            8128K        2.0M
3148           7164K            7912K        2.3M
3149          49703K           55352K        1.9M
3150           6505K            8224K        2.2M
3151          34606K           37872K        1.9M
3152           5790K            8096K        2.1M
3153           4734K            8332K        2.0M
3154          20218K           22364K        2.2M
3155           6253K            8148K        2.0M
3156          36062K           39136K        2.3M
3157          48624K           54044K        2.1M
3158          10363K           24048K        2.2M
3159           7874K           23680K        2.0M
3160           7667K           23444K        2.0M
3161           7152K            7984K        1.7M
3162          10297K           25472K        2.0M
3163          33197K           35864K        1.7M
3164           4878K            8312K        2.0M
3165           6265K            8292K        1.8M

Each instance runs with:

/opt/blackdown-jdk-1.4.2.01/bin/java -Xms2m -Xmx128m -Xss256k -verbose:gc

(A rough sketch of how the instances are launched is included after the
observations below.)

Observations:
-------------
1. Each RabbIT instance has served many more requests in the last hours than
   the single RabbIT that was freezing after just 1-2 minutes. This might
   mean that the "memory leak" depends on the number of requests per second,
   or on the time needed to release some resource.
2. Although the front-end proxies really do load balance (round-robin, one by
   one) across all the RabbITs, only a few of them appear to use much more
   RAM.
3. RabbIT seems to recover from the "memory leak" in several cases; see for
   example instances 3126 and 3136.
4. The memory usage seems to increase in relatively huge steps for the
   RabbITs that use more memory.
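A note on the thread dump suggestion above: if the console where java was
started is no longer around, sending SIGQUIT to the process should produce
the same full thread dump as Control-Backslash (this is standard JVM
behaviour on Linux, not anything RabbIT-specific):

    kill -QUIT <pid of the rabbit java process>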
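In case it is useful, the instances are started with something like the
following (a simplified sketch only: the rabbit.proxy.Proxy main class and
the -f config-file option are written from memory and may need adjusting to
the real startup command, and each conf/rabbit-PORT.conf is assumed to set
that instance's listening port):

    #!/bin/sh
    # Start one RabbIT per port, each with its own config file, and keep
    # the -verbose:gc output in a per-instance log.
    JAVA=/opt/blackdown-jdk-1.4.2.01/bin/java
    OPTS="-Xms2m -Xmx128m -Xss256k -verbose:gc"
    for PORT in `seq 3125 3165`; do
        CONF=conf/rabbit-$PORT.conf    # assumed to set the port to $PORT
        nohup $JAVA $OPTS rabbit.proxy.Proxy -f $CONF \
            > logs/gc-$PORT.log 2>&1 &
    done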
To illustrate observation 4, here is a contiguous part of the GC output of
instance 3149:

[GC 2495K->2014K(2808K), 0.0014060 secs]
[GC 2526K->2023K(2808K), 0.0010150 secs]
[GC 2535K->2021K(2808K), 0.0009480 secs]
[GC 2533K->2025K(2808K), 0.0007940 secs]
[GC 2530K->2053K(2808K), 0.0006890 secs]
[GC 2565K->2058K(2808K), 0.0009310 secs]
[GC 2570K->2091K(2808K), 0.0010140 secs]
[GC 2603K->2112K(2808K), 0.0008150 secs]
[GC 2573K->2123K(2808K), 0.0009370 secs]
[Full GC[Unloading class sun.reflect.GeneratedConstructorAccessor21]
[Unloading class sun.reflect.GeneratedMethodAccessor11]
[Unloading class sun.reflect.GeneratedConstructorAccessor24]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor16]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor17]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor15]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor13]
[Unloading class sun.reflect.GeneratedMethodAccessor10]
[Unloading class sun.reflect.GeneratedConstructorAccessor26]
[Unloading class sun.reflect.GeneratedConstructorAccessor25]
[Unloading class sun.reflect.GeneratedConstructorAccessor27]
2123K->1398K(2808K), 0.0285820 secs]
[GC 20534K->20105K(21432K), 0.0013380 secs]
[GC 20617K->20147K(21432K), 0.0012960 secs]
[GC 20481K->20126K(21432K), 0.0009100 secs]
[Full GC[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor18]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor19]
20126K->19837K(21432K), 0.0577450 secs]
[GC 41867K->39770K(55352K), 0.0030100 secs]
[GC 42074K->39809K(55352K), 0.0050160 secs]
[GC 42113K->39878K(55352K), 0.0025770 secs]
[GC 42175K->39899K(55352K), 0.0026060 secs]
[GC 42203K->39950K(55352K), 0.0032300 secs]
[GC 42254K->40025K(55352K), 0.0032420 secs]

As shown, the GC log indicates that the memory usage "suddenly" went from 2MB
to 20MB, then again suddenly from 20MB to 40MB, and stayed there!

On the other hand, instance 3126, which seems to have managed to recover,
logs this:

[GC 2170K->1691K(2832K), 0.0010940 secs]
[GC 2203K->1703K(2832K), 0.0010490 secs]
[GC 2215K->1725K(2832K), 0.0009540 secs]
[GC 2237K->1748K(2832K), 0.0009260 secs]
[GC 1928K->1723K(2832K), 0.0008290 secs]
[Full GC[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor16]
[Unloading class sun.reflect.GeneratedConstructorAccessor23]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor15]
[Unloading class sun.reflect.GeneratedConstructorAccessor24]
[Unloading class sun.reflect.GeneratedMethodAccessor11]
[Unloading class sun.reflect.GeneratedConstructorAccessor25]
1723K->1321K(2832K), 0.0271570 secs]
[GC 20083K->19653K(21092K), 0.0014260 secs]
[GC 20165K->19696K(21092K), 0.0013250 secs]
[GC 20208K->19708K(21092K), 0.0011420 secs]
...
(hundreds of lines in which the memory usage gradually increases)

[GC 41624K->40189K(41908K), 0.0036930 secs]
[GC 41725K->40222K(41908K), 0.0028510 secs]
[GC 41758K->40303K(41908K), 0.0027120 secs]
[GC 41837K->40367K(41908K), 0.0019370 secs]
[Full GC[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor22]
[Unloading class sun.reflect.GeneratedConstructorAccessor31]
[Unloading class sun.reflect.GeneratedMethodAccessor14]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor23]
[Unloading class sun.reflect.GeneratedConstructorAccessor32]
[Unloading class sun.reflect.GeneratedConstructorAccessor33]
[Unloading class sun.reflect.GeneratedMethodAccessor15]
[Unloading class sun.reflect.GeneratedConstructorAccessor35]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor21]
[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor24]
[Unloading class sun.reflect.GeneratedConstructorAccessor34]
40367K->5730K(41908K), 0.0836510 secs]
[GC 8480K->5967K(41204K), 0.0083550 secs]
[GC 8719K->6069K(41204K), 0.0040610 secs]
[GC 8821K->6118K(41204K), 0.0054360 secs]
[GC 8870K->6105K(41204K), 0.0052020 secs]

Here, the "Full GC" managed to release the memory back to where it should be,
although only after a few hours of work.

I guess the above clearly shows that there is some kind of direct or indirect
memory leak, and I hope it will help you debug it.

Thanks!
Costa