#26 Event Data GC stress relief

1.13 Accepted
closed-accepted
nobody
COM Data (10)
5
2014-08-11
2007-04-17
Jan
No

The VariantViaEvent objects are (and should be) JacobObjects. Previous to the patch, without exception each and every JacobObject is (during construction) entered in a per-thread object table in ROT.
Many a COM server starts a new thread for each COM event and so the thread list in ROT is ever growing, since (obviously) thread objects, at least under certain circumstances, are forever reachable inside the java VM and therefor never removed from the WeakHashMap.
For such a scenario, with this patch the resulting memory leak can be fixed by setting the system property "com.jacob.com.VariantViaEventInROT" to "false". This leads to VariantViaEvent objects being never put into the ROT (which is mostly unnecessary anyway) and, unless the event callback creates other JacobObjects, no per-thread entry being created for the callback thread.
Due to the generalized nature of the patch, also other (for example, derived) JacobObjects may be excluded from being logged into the ROT by setting the relevant "...InROT" system property to false, should the event threads need to create such objects.

Only the constructor has been changed to:

public JacobObject() {
String st = System.getProperty(getClass().getName() + "InROT");
if (st == null || !st.equalsIgnoreCase("false"))
ROT.addObject(this);
}

If the relevant property is not set at all or not set to "false", the previous behaviour remains, so no side effects on existing code are expected.

Discussion

  • Jan

    Jan - 2007-04-17

    JacobObject.java

     
  • clay_shooter

    clay_shooter - 2007-04-18

    Logged In: YES
    user_id=1189284
    Originator: NO

    That is one reason we did the autogc=true. Of course that is a more blunt instrument that can cause other issues.

     
  • Jan

    Jan - 2007-04-18

    Logged In: YES
    user_id=1741699
    Originator: YES

    autogc=true will NOT keep the undesired JacobObjects from getting into the ROT first hand, it attempts to make it easier for the gc to "repair" the situation. This patch enables the programmer to selectively keep the JacobObjects from being kept in the ROT at all, when this is not desired, and so the gc does not even have to "fix" that. In an environment like ours, with (over time) many thousands of event threads, this really makes a difference in behaviour of the gc. I have to admit this is at the cost of having a few String operations per JavaObject, but the WeakMap seems to be much more problematic.
    Another idea:
    The String operations on the heap (concatenations) could even be eliminated if each JacobObject extending class would be enforced to have a function that returns the relevant property name as a constant - then the patch would be completely heap-neutral. I just did not want to have to change all JacobObject classes at once ;-)... but the longer I think about it, the more it appears to make sense. If you'd like me to produce such a patch instead, just let me know ;-)

     
  • clay_shooter

    clay_shooter - 2007-04-22

    Logged In: YES
    user_id=1189284
    Originator: NO

    Why are "threads forever reachable"? The threads are spun up in response to the event and then terminated when the event handler is finished. The weak hashmap should be fine.

    Are you saying the threads continue to exist, that we are leaking thread handles?

     
  • Jan

    Jan - 2007-04-23

    Logged In: YES
    user_id=1741699
    Originator: YES

    I probably was a bit inexact in indicating the "threads stay reachable". The ROT map uses the thread name as key, and under certain circumstances, it seems these names are not being released from somewhere else. I'm not blaming jacob for this, nor do I even think that this is jacob's fault. It may even be that I'm a bit influencing this behaviour by observing it, as good old Heisenberg might say, because maybe it may be that some profiling tool keeps thread NAMES alive instead of creating a copied string. The thread HANDLES are definitely NOT leaking from my observation. So you might be right in accusing me of curing symptoms, not deseases, in this case. But when I have to profile a complex application, I need means to get rid of factors that practically destroy the possibility for any measurement... and switching off the ROT (unused anyway) for events greatly helps in this case.
    And, I assume, without attempting to say I know for sure, that the GC really does treat WeakMap Entries differently than "regular freed objects". Just from observation (CPU time, memory balance, profiler output), without looking at the GC source, it seems to me that not using this WeakMap when it is not needed makes it much easier for the gc.
    And, after all, the change I was proposing has next to no side effects (only one String concatenation on the heap that we could even get rid of if we really wanted, as indicated in my first follow-up, and no functional change without explicitely setting system properties), so it is probably not even worth much more discussion.

    Greetings from sunny Germany...

     
  • Jan

    Jan - 2007-05-15

    Logged In: YES
    user_id=1741699
    Originator: YES

    We now found another long-term impact of having the ROT.
    Opposed to event threads that were discussed earlier, a main thread that sends COM commands may exist for a very long time. During all that time, each and every jacob object is being kept collected in the table until said thread ceases to exist (which may be any positive amount of time after its "run" method has terminated). Unless that thread gets terminated, destroyed and reanimated from time to time, ALL jacob objects are endlessly collected.
    So, either we are forced to permanently start and quit threads for sending COM commands, or there should be a way of ideally shutting off the ROT completely for the vast majority of jacob users that never make any use of the ROT.

    For the moment, I use this patch and within the initialization code I have the three calls

    System.setProperty("com.jacob.com.VariantViaEventInROT", "false");
    System.setProperty("com.jacob.com.DispatchInROT", "false");
    System.setProperty("com.jacob.com.VariantInROT", "false");

    et voilà, all of a sudden the java-side memory issues in our application are history without any loss of functionality, performance or stability. Just switching off the VariantViaEvent collection alone brought us to 80% of our goal, the other two lines did the rest.

    As said earlier, this patch has next to no effect at all (a few simple String operations, that's it) for those who do not actively use it, so there should be no reason to reject it. And I would definitely like to quit having non-standard jacob code in my project, so I would appreciate if you could integrate this patch (or, of course, any other solution that similarly addresses this issue by circumventing the ROT).

    Greets

    Jan

     
  • clay_shooter

    clay_shooter - 2007-05-15

    Logged In: YES
    user_id=1189284
    Originator: NO

    did you try the autogc flag? It puts the JacobObjects in a weak hash map where they can be GC'd. Its similar to what you are suggesting but seems to have stability issues in some applications.

     
  • clay_shooter

    clay_shooter - 2007-05-15

    Logged In: YES
    user_id=1189284
    Originator: NO

    What about the additional string object created to match the system property for every new JacobObject?

     
  • Jan

    Jan - 2007-05-15

    Logged In: YES
    user_id=1741699
    Originator: YES

    The autogc/WeakHashMap did not help, really. The maximum it can do is allow the gc to collect the table of a thread when a thread name is no longer reachable. Obviously, for long-term threads, this does nothing, and for short-term event threads there seem to be other reasons why the thread NAME can survive the thread. Besides that, the cleaning of WeakReferences seems to be kind of unpredictable (as many aspects of the gc). I have definitely seen many occurrences of WeakReferences surviving more than one gc, although it WAS the last reference to the object. A few GCs later on, the reference was tehn killed... but it seems to impose MUCH overhead sometimes.

    The additional String objects (local variable "st", mangled class name) are not stored after use and not even weakly reachable ("st" is only a method local variable that is obsoleted directly after JavaObject construction), and these objects ARE collected by the gc immediately the next time. I was checking our application with a profiler to be sure the ROT was the last java-side leak, and it was...

     
  • Jan

    Jan - 2007-05-15

    Logged In: YES
    user_id=1741699
    Originator: YES

    Another question: What are the known stability issues with the "autogc" flag?
    Or, to ask the same question in a completely different way, what are the reasons why someone should really need the ROT and its objects?

    For our scenario, it does not seem to be of any use, and I have not found any hint in the code why it should be really necessary, although stability issues with autogc indicate that such cases do exist.

    I can only assume that it is relevant in two kinds of scenarios.

    One is scenarios that intentionally share JacobObjects between threads, but without giving every thread the necessary permanent reference (which is done "behind the scenes" by the ROT). In that case, I would call the ROT rather a "comfort feature" that assists bad programming habits... but I'm open to arguments that prove the opposite.

    The other optional purpose might have been in giving a possibility to call "cleanup" methods (safeRelease) of obsolete JacobObjects for a given thread at any point, possibly even after thread termination. In this case, jacob users will usually know about this (they wrote the delayed safeRelase calls) and rather not activate autogc (and even less, the properties for this patch). But then again, I consider this questionnable programming habits, and if this is necessary at all, I would say the creator "should" rather keep his own table for later cleanup...

    I for my part do a safeRelease whenever I see I no longer need a JacobObject, usually if not everywhere early within the same method that creates the object. With this (recommendable) programming habit, global object tables are rarely of any use.

     
  • clay_shooter

    clay_shooter - 2007-05-15

    Logged In: YES
    user_id=1189284
    Originator: NO

    I didn't write the ROT stuff but you can see why it exists if you look at the COM lifecycle documents in the docs directory in the distribution. The ROT insures that the COM objects are freed as part of the MTA/STA lifecycle and not as part of some Java object lifecycle taht might be different. The ROT is emptied for a given thread whenever the end of COM lifecycle method is called. You are really supposed to start the multi-threaded apartment model at the start of a big piece of work and then end it when the work is done. That insures that none of the COM memory is freed while it is still needed by some other related COM object. It all gets freed en-masse when the thread is done with the block of work. I doubt many Java developers know when they should or shouldn't free windows memory. It may be that the windows memory needs to be freed out of synch with when normal Java GC would on the Java side of the windows data. Does "no reference" to a Variant or Dispatch mean that the associated windows memory should be freed? Probably not in all cases.

     
  • Jan

    Jan - 2007-05-15

    Logged In: YES
    user_id=1741699
    Originator: YES

    >>The ROT insures that the COM objects are freed as part of the MTA/STA lifecycle
    and not as part of some Java object lifecycle taht might be different. <<

    >>The ROT is emptied for a given thread whenever the end of COM lifecycle method
    is called. <<

    If it is used that way.... the "autogc" allows the java gc to clean up the java references (and thus, all ways to clean up the windows mem behind them), be the windows memory still allocated or not, and thus clearly violates these life cycle rules - unless, yes, unless the programmer intentionally and expressively cleans up the windows memory on his own behalf BEFORE the java gc can do its work. That is, the WeakHashMap should only have stale references to java objects WITHOUT windows mem behind them unless the programmer did a mistake that leads to a windows memory leak in COM data.
    IF the programmer does take care and cleans up everything, there is no need for someone later on to clean up these stale references - even less if these weak references might have been killed beforehand by the gc, so such a programmer does not need a ROT at all.
    IF the programmer is or wants to be careless about this stuff, the references must never be weak to ensure every reference is still there so the windows mem can be cleaned up automatically.
    So, I see two classes of programmers: those that want the automatism of a ROT cleanup; these need hard, not weak references; and those who are the "do-it-yourself" kind like me, who are used to clean up from C++ and the like; these have no use for a ROT at all. The "compromise" of a WeakMap IS error-prone and WILL lead to difficult-to-find windows memory leaks.
    >>If you choose to call release methods on your objects yourself, that is allowed. In that case, when the thread is released the release calls will be no-ops. <<

    >> You are really supposed to start the multi-threaded apartment model at the start of a big piece of work and then end it when the work is done. <<

    Sorry, this is not always an option. The COM server in our case controls certain hardware, is slow to initialize (tens of seconds) and terminate, will try to restart the hardware on every COM init - and worst of all, I cannot change the server behaviour. I cannot afford starting and terminating this server several times a minute, otherwise our program will spend more time in hardware initialization stuff than with anything else. I MUST be able to keep connected for any unspecified amount of time. This should be not so unique, I assume there are more users that run into different kinds of problems when being forced to permanently reconnect instead of keeping the connection up and alive. I am really willing to pay the price of being responsible for cleanup order (which I am already doing successfully now), I just kindly request that the necessary "extensibility feature" of being able to disable the ROT is somehow integrated so we do not need a separate development tree for that.

    >>Does "no reference" to a Variant or Dispatch mean that the associated windows memory should be freed? Probably not in all cases. <<

    Depends on the programmer.

    IF he is of the "don't care"-type, he will not have to use the "shut-up-ROT" feature and will be as happy as (hopefully) before.

    IF he (what I recommend if it comes to using system resources like windows mem) decides to do the cleanup on his own, he of course is then responsible of keeping a reference as long as the underlying windows mem is still needed. In many if not most cases, this should even be fairly easy. Most Variants, for example, are used as function/method parameters that are obsolete after method return (the server will have its own copy of the data by then). And for the rest: how long return values of functions must last... is up to the calling function/method. Most interfaces can easily be written with such a straightforward interface object life cycle. If there is a need for a life cycle more complex, the complexity is probably always better placed in the "real" java or "real" c++ part, not in the interface itself, where the COM objects reside.

    If you think the option of shifting the cleanup responsibility of the programmer (jacob user) is dangerous... then you are invited to put a big warning sign in the doc if and where you document this feature. Noone is forced to use it...

     
  • clay_shooter

    clay_shooter - 2007-05-15

    Logged In: YES
    user_id=1189284
    Originator: NO

    We could change the behavior of autogc to eliminate the we hash map. autogc=true could operate the way your code does and never puts anything into the ROT.

    I'm trying to remember, there was some other issue around ROT where the objects had to be freed in the thread that created them so that you didn't get some type of access violation.

     
  • Jan

    Jan - 2007-05-16

    Logged In: YES
    user_id=1741699
    Originator: YES

    >>I'm trying to remember, there was some other issue around ROT where the
    objects had to be freed in the thread that created them so that you didn't
    get some type of access violation.<<

    I have seen that somewhere, too.

    This is probably the main reason for the ROT to be organized by thread, and for the requirement that each thread "should" expressively call the ROT cleanup as last COM operation.
    But if the user follows the "free every object as late as necessary, and as early as possible" rule, he will usually have the same scheme everywhere for COM calls:

    Create Variants->Call COM with them->destroy parameter Variants->Copy return value to java->Destroy return variant

    Should the need arise to share parameters or return values between threads, the shared objects shoud not be Jacob objects, but "regular java objects" with copied content of the jacob variants. The only drawback would be that you cannot communicate within variants over thread boundaries, which I would consider bad programming habit again (and clearly state so in the documentation?), as Jacob Variants are clearly interface objects and should not carry application logic.

    For event callbacks with "byval parameters", the cleanup CAN be done in the java event callback method. For event "byref parameters" (and for "byval" with "forgotten cleanup" in java), as far as I can remember, we already clean the windows mem within EventProxy::Invoke. If we ever find a windows memory leak with VariantViaEvent, it will have to be fixed exactly there - as the user will not have any option to call any cleanup code later on, the thread will never return to his java code from there. Again, it is then assured that creation and cleanup of the Variant occur within one thread - unless the user fills the byref from another thread in java space, which would be an almost "uncurable" mistake (again, docu...?): not even the ROT will help against this problem when Java object and windows memory are created in different threads...

    If you consider to make "autogc" kill the ROT... I would rather use a different name for this new option. "auto..." is a bit confusing, as in fact we DISABLE an automation feature if we kill the ROT and require the user to do his own cleanup everywhere :-)

     
  • Jan

    Jan - 2007-05-16

    Logged In: YES
    user_id=1741699
    Originator: YES

    >> The ROT is emptied for a given thread whenever the end of COM lifecycle method is called. <<

    Actually, this IS a strong reason for never wanting having any VariantViaEvent in ROT: because there is no suitable option for the user where to put the "end of COM lifecycle method" call in an event callback, at least when byref parameters are being used which must survive the user space java code. VariantViaEvent thus was the first type of java Object that I wanted to exclude from the ROT, when this hit my eyes... until later I stumbled into the "long-term thread" problem.

    Since this "end of COM lifecycle method" can never be called in this case, the IS a memory leak for Event callbacks - this was probably attempted to cure with "autogc". But I think the only absolutely clean option is having no event data ever appear in the ROT, thus never even creating a ROT entry for any event thread at all (which is what I am doing now).

    Sorry for "spamming"... ;-)

    Greets

    Jan

     
  • clay_shooter

    clay_shooter - 2007-05-17

    Logged In: YES
    user_id=1189284
    Originator: NO

    I moved the code to the ROT and changed the suffix to PutInROT so the property would be System.setProperty("com.jacob.com.VariantViaEvent.PutInROT", false);

    This change is part of 1.13-M2 now available on SourceForge

     
  • clay_shooter

    clay_shooter - 2007-05-17
    • milestone: --> 1.13 Accepted
    • status: open --> pending
     
  • SourceForge Robot

    • status: pending --> closed
     
  • SourceForge Robot

    Logged In: YES
    user_id=1312539
    Originator: NO

    This Tracker item was closed automatically by the system. It was
    previously set to a Pending status, and the original submitter
    did not respond within 14 days (the time period specified by
    the administrator of this Tracker).

     
  • clay_shooter

    clay_shooter - 2011-10-02
    • status: closed --> closed-accepted
     

Log in to post a comment.