From: Shai <sha...@ya...> - 2011-10-17 16:08:10
|
Hi guys,

I've been banging my head against this memory leak for the past couple of weeks, and hopefully you can help me understand it.

My code, which is mostly native iPhone code, uses two threads: main and logical. I have quite a lot of native objects allocated (large image objects), but I clean them all up in the finalizer code on the Java side (the Java object holds a "pointer", and when the finalizer is invoked a native method cleans up the pointer).

However, it seems the finalizers are only invoked occasionally. This causes the RAM to run out almost immediately, and for some reason Xcode's Instruments seems to be completely useless against this particular leak.

Debugging the GC, it seems to be working as expected and never grows above 3 MB of total heap (which makes sense for the app), but the 128 MB of free memory gets eaten up pretty quickly.

Looking at FinalizerNotifier.java, I see that the code uses a finalizerMutex which disables the GC. However, since I have a separate thread which keeps allocating data and doesn't synchronize against this mutex, won't that pose a problem?

Thanks.
Shai. |
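For readers following along, the peer pattern described in this message might look roughly like the sketch below. It is only an illustration: NativeHeap is a hypothetical stand-in for the native allocator (it just counts bytes), NativeImage and release() are made-up names, and none of this is XMLVM code. It shows why the Java heap can stay tiny while native memory grows: each Java object is a few bytes, but the memory behind its "pointer" is only returned when release() runs.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the native allocator; it only counts live bytes. */
final class NativeHeap {
    static final AtomicLong liveBytes = new AtomicLong();

    static long allocate(long size) {
        liveBytes.addAndGet(size);
        return size; // illustration only: the "pointer" just remembers the size
    }

    static void free(long pointer) {
        liveBytes.addAndGet(-pointer);
    }
}

/** Java-side peer of a large native image (hypothetical name). */
final class NativeImage {
    private long pointer; // 0 means the native memory was already released

    NativeImage(long sizeInBytes) {
        pointer = NativeHeap.allocate(sizeInBytes);
    }

    /** Frees the native memory; safe to call more than once. */
    synchronized void release() {
        if (pointer != 0) {
            NativeHeap.free(pointer);
            pointer = 0;
        }
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        release(); // only runs if and when the runtime gets around to finalization
    }

    public static void main(String[] args) {
        NativeImage image = new NativeImage(1024 * 1024);
        System.out.println("live native bytes: " + NativeHeap.liveBytes.get());
        image.release();
        System.out.println("live native bytes: " + NativeHeap.liveBytes.get());
    }
}
```

Because release() is idempotent, it can be called both explicitly and from finalize() without double-freeing, which may help wherever the object lifecycle happens to be known.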
From: Paul P. <bay...@gm...> - 2011-10-17 16:29:16
|
Yes, the garbage collector is temporarily disabled by design. It would be wonderful if we never had to disable the GC, so if you have suggestions based on the points below, let's discuss them. I will say that garbage collection can get complicated, though.

First, I would recommend in general, whether using XMLVM or not, keeping the number of finalizers as small as possible. Finalizers cause a lot of overhead, and there is no guarantee when they are invoked. That is true in a normal JVM as well.

Now to the issue at hand: yes, the GC is multi-threaded (single-threading the GC causes other undesirable issues such as deadlocks and performance problems). When it is determined that there are finalizers to run (call the thread that determines this thread 1), the GC is disabled and a separate thread (call this thread 2) is notified to run the finalizers. If the GC were not disabled, thread 1 would likely garbage collect the instances before thread 2 had a chance to run the finalizers, and you'd get an EXC_BAD_ACCESS error.

As it is, thread 2 re-enables the GC once there are NO finalizers left to run. That usually means garbage collection has to wait until the next pass and hope there aren't more finalizers temporarily blocking the GC. So if every instance had a finalizer, the GC would never get a break to clean up.

I agree this is not ideal and would prefer a better solution, but this is what we have for now. We use the Boehm GC, and part of this solution is due to our understanding of the Boehm GC. Perhaps there is something available in the Boehm GC to do this better?

Thanks,
Paul

_______________________________________________
Xmlvm-developers mailing list
Xml...@li...
https://lists.sourceforge.net/lists/listinfo/xmlvm-developers |
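The handshake described in this reply can be modelled in a few lines. The sketch below is a deliberately single-threaded toy, not the XMLVM implementation: the class and method names are invented, the two thread roles are just two methods, and "collection" is only a boolean flag. It is meant to show the one property under discussion, namely that collection stays off from the moment finalizers are queued until the queue is completely empty.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of the notifier/finalizer-thread handshake (all names hypothetical). */
final class FinalizerHandshakeModel {
    private final Object lock = new Object();
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private boolean collectionEnabled = true;

    /** Role of "thread 1": finalizable objects were found, so queue them and disable collection. */
    void notifyFinalizers(Runnable... finalizers) {
        synchronized (lock) {
            for (Runnable f : finalizers) {
                pending.add(f);
            }
            if (!pending.isEmpty()) {
                collectionEnabled = false;
            }
        }
    }

    /** Role of "thread 2": run queued finalizers, re-enable collection only when none are left. */
    void drain() {
        while (true) {
            Runnable next;
            synchronized (lock) {
                next = pending.poll();
                if (next == null) {
                    collectionEnabled = true;
                    return;
                }
            }
            next.run(); // run outside the lock so a finalizer may queue more work
        }
    }

    boolean collectionAllowed() {
        synchronized (lock) {
            return collectionEnabled;
        }
    }

    public static void main(String[] args) {
        FinalizerHandshakeModel model = new FinalizerHandshakeModel();
        model.notifyFinalizers(() -> System.out.println("finalize A"),
                               () -> System.out.println("finalize B"));
        System.out.println("collection allowed: " + model.collectionAllowed());
        model.drain();
        System.out.println("collection allowed: " + model.collectionAllowed());
    }
}
```

In this model, an application thread that keeps producing finalizable objects keeps the queue non-empty, so collectionAllowed() stays false for long stretches. That would be consistent with the "GC never gets a break" case mentioned above, though the real behaviour depends on the Boehm GC and XMLVM internals.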
From: Shai <sha...@ya...> - 2011-10-17 17:10:34
|
Thanks,

I don't have a choice about using finalizers, since I'm porting an existing API and monitoring object lifecycles would make the API considerably more complex (not to mention break existing 3rd-party code).

Finalizers have the drawback of "double cycle" GC, but for this particular case I don't see another option. E.g. they are used within Java for stream closing and component/image peer disposal. Since IO streams have a finalizer and must always be used from a separate thread, I'm guessing I'm not the only one who runs into this problem; it's probably worse in my case, though.

The best solution would have been invoking the finalizers on the GC thread, but since you say the performance is prohibitive (I'm already struggling with performance here), this isn't an option.

Java itself doesn't handle the case of every object having a finalizer well (which isn't my case at all), since it performs finalization instead of collection and only collects in the next cycle.

I don't quite understand why GC_register_finalizer_unreachable can't be used to register a function that invokes the finalizer directly? |
From: Paul P. <bay...@gm...> - 2011-10-17 17:59:29
|
It sounds like you know what you're talking about, so forgive me if I state the obvious.

For our discussion, there are 3 types of threads:

1. GC thread - created by Boehm code
2. Finalizer thread - created by XMLVM
3. All application threads

So any cross-compiled Java code, besides that invoked from a finalize(), runs in thread type #3.

Currently, the actual garbage collection is also done in thread type #3. Every time memory is allocated, the GC gets a chance to run in that same thread. I.e. that thread will first invoke the finalizer notifier and then run the GC.

In our case, the finalizer notifier temporarily disables collection and broadcasts a message to our finalizer thread to begin finalizations in the thread of type #2. Aside from performance, if the finalize() invocations were done in the same thread instead of broadcasting to the other thread, deadlocks could occur. I.e. consider a synchronized block in a finalize() method that could be invoked any time you dared use the "new" keyword.

That is to say, if we could move the garbage collection to the same thread as the finalization (not of thread type #3) so that we didn't have to disable collection at any point, that would be ideal. Right now I'm at the mercy of the community for their expertise with the Boehm GC.

I have only used GC_finalizer_notifier and not GC_register_finalizer_unreachable so far. Could you explain the difference?

This is a good discussion to have, especially since all iOS instances have finalize() methods to release the Obj-C counterpart.

Thanks,
Paul |
From: Paul P. <bay...@gm...> - 2011-10-17 18:05:19
|
P.S. I misread it before. I thought you were referring to GC_finalizer_notifier and that's what I meant to type. We currently use GC_REGISTER_FINALIZER_NO_ORDER.

Paul |
From: Arno P. <ar...@pu...> - 2011-10-17 18:29:06
|
Right now we use GC_register_finalizer_no_order() because otherwise there would be cycles in objects that have a finalizer (which is common in Java). If you are interested in the details, read the section on topological ordering:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/finalization.html

The method you suggested (GC_register_finalizer_unreachable()) is an optimization, but it potentially introduces finalizer cycles again. I don't see how the code-generating backend could distinguish when to use GC_register_finalizer_unreachable() and when to use GC_register_finalizer_no_order().

Arno |
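To make the cycle problem concrete, here is a small Java model of the two policies. It is a simplification written for this thread, not the Boehm GC algorithm: Node, readyOrdered and readyNoOrder are invented names, and every node is assumed to be unreachable garbage with a finalizer. Under the ordered policy a node is only ready when no other finalizable node can reach it, so two nodes that point at each other are never ready; under the no-order policy everything is ready at once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified model of ordered vs. unordered finalization (all names hypothetical). */
final class FinalizationOrderModel {
    /** An unreachable object that has a finalizer and may point at other such objects. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** True if target can be reached by following one or more references out of start. */
    static boolean reaches(Node start, Node target) {
        Set<Node> seen = new HashSet<>();
        List<Node> stack = new ArrayList<>(start.refs);
        while (!stack.isEmpty()) {
            Node n = stack.remove(stack.size() - 1);
            if (n == target) return true;
            if (seen.add(n)) stack.addAll(n.refs);
        }
        return false;
    }

    /** Ordered policy: a node is ready only if no finalizable node (itself included) reaches it. */
    static List<String> readyOrdered(List<Node> garbage) {
        List<String> ready = new ArrayList<>();
        for (Node candidate : garbage) {
            boolean blocked = false;
            for (Node other : garbage) {
                if (reaches(other, candidate)) { blocked = true; break; }
            }
            if (!blocked) ready.add(candidate.name);
        }
        return ready;
    }

    /** No-order policy: every unreachable finalizable node is ready immediately. */
    static List<String> readyNoOrder(List<Node> garbage) {
        List<String> ready = new ArrayList<>();
        for (Node n : garbage) ready.add(n.name);
        return ready;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refs.add(b);
        b.refs.add(a); // a <-> b form a cycle
        List<Node> garbage = List.of(a, b);
        System.out.println("ordered:  " + readyOrdered(garbage));
        System.out.println("no-order: " + readyNoOrder(garbage));
    }
}
```

Since the code generator cannot know in advance which Java objects will end up in such cycles, the model illustrates why picking one registration function per object is hard to do automatically.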
From: Shai <sha...@ya...> - 2011-10-17 19:03:31
|
Thanks, my main point is about the necessity of the FinalizerThread more than the specific function used to bind the callback. I'll take a closer look at this. |
From: Shai <sha...@ya...> - 2011-10-17 19:02:04
|
I don't think I know much about the subject. I've ported VMs, but usually treated the GC as a "black box" sort of thing. This past week I did some digging in the code (trying to pinpoint the issue), and that's the source of most of my understanding of Boehm.

A deadlock or other problem in a finalizer is probably the mistake of the finalizer's author; many VMs fail for this use case. This can pose a problem for cases such as a finalizer trying to close a stream, which might call flush() and hence cause a delay or even a failure. But that use case would probably be problematic with every VM.

Looking at it, I find the code a bit hard to follow, and I'm not exactly sure it's properly configured for threading to begin with. E.g. for thread support GC_stop_world() needs to be defined via GC_set_stop_func() (there is such a method in the pthread version, but no one is invoking it). Anyway, if the world were stopped during GC, it might actually improve the performance and fix some of the issues you were seeing in the past. Or am I missing something here?

I don't think there would be a major difference between invoking directly from the no-order method you are currently using and the method I suggested (it just seemed more appropriate since it's designed for Java). I don't know how to designate the GC thread as the finalizer thread, but I think setting the finalizer to invoke directly will solve this issue just as well.

Thanks. |
From: Paul P. <bay...@gm...> - 2011-10-17 19:22:38
|
Consider the example below. If we invoked finalize() in the same thread, it would deadlock, and unfortunately that can't be blamed on poor application code since it's unrelated/unpredictable. If both threads are attempting to "do stuff" at the same time, it would deadlock. That's one reason the finalize() invocation occurs in a different thread, but if we could do collection in that same finalizer thread, that would be nice.

Thread 1:

    synchronized (obj1) {
        synchronized (obj2) {
            // do stuff
        }
    }

Thread 2:

    synchronized (obj2) {
        new Object(); // "new" causes a finalize() invocation on the instance below
    }

A different instance which is ready for finalization:

    protected void finalize() {
        synchronized (obj1) {
            // do stuff
        }
    }

Thanks,
Paul |
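The scenario in this message can be reproduced without actually hanging a process. The sketch below is an illustration only: it swaps synchronized for ReentrantLock so that the finalizer can give up after a timeout (the 500 ms timeout stands in for "blocks forever"), and the class and method names are made up. Running the finalizer inline on the allocating thread fails to get obj1, while handing it to a separate thread lets everything finish.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Demonstrates the inline-finalizer deadlock with a timeout instead of a real hang. */
final class InlineFinalizerDeadlock {
    /** Returns true if the finalizer managed to take obj1. */
    static boolean run(boolean inline) {
        ReentrantLock obj1 = new ReentrantLock();
        ReentrantLock obj2 = new ReentrantLock();
        CountDownLatch thread1HoldsObj1 = new CountDownLatch(1);
        boolean[] finalized = {false};

        // finalize(): needs obj1; giving up after the timeout stands in for blocking forever
        Runnable finalizer = () -> {
            try {
                if (obj1.tryLock(500, TimeUnit.MILLISECONDS)) {
                    try { finalized[0] = true; } finally { obj1.unlock(); }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // Thread 1: takes obj1, then waits for obj2
        Thread thread1 = new Thread(() -> {
            obj1.lock();
            try {
                thread1HoldsObj1.countDown();
                obj2.lock(); // blocks until thread 2 releases obj2
                try { /* do stuff */ } finally { obj2.unlock(); }
            } finally {
                obj1.unlock();
            }
        });
        Thread finalizerThread = new Thread(finalizer);

        try {
            // Thread 2 (the caller): holds obj2 and "allocates", which triggers finalization
            obj2.lock();
            try {
                thread1.start();
                thread1HoldsObj1.await();
                if (inline) {
                    finalizer.run(); // obj1 is held by thread 1, which is waiting for our obj2
                } else {
                    finalizerThread.start(); // hand off; we can release obj2 and let thread 1 finish
                }
            } finally {
                obj2.unlock();
            }
            thread1.join();
            if (!inline) finalizerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return finalized[0];
    }

    public static void main(String[] args) {
        System.out.println("inline finalizer got obj1:   " + run(true));
        System.out.println("separate thread got obj1:    " + run(false));
    }
}
```

With plain synchronized blocks there is no timeout, so the inline case would simply never return, and neither thread's application code would be doing anything wrong on its own.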