Menu

#295 Java director connector not thread safe

None
closed
None
5
2020-08-28
2012-02-11
Brian Cole
No

The 'swig_connect_director' generated method is not thread safe due to the use of two function local static variables. The following patch fixes the problem (we are resorting to post-processing the swig output to fix the problem).

Index: Source/Modules/java.cxx

--- Source/Modules/java.cxx (revision 12905)
+++ Source/Modules/java.cxx (working copy)
@@ -4272,11 +4272,11 @@
else
internal_classname = NewStringf("%s", classname);

- Wrapper_add_localv(w, "baseclass", "static jclass baseclass", "= 0", NIL);
+ Wrapper_add_localv(w, "baseclass", "jclass baseclass", "= 0", NIL);
Printf(w->def, "void %s::swig_connect_director(JNIEnv *jenv, jobject jself, jclass jcls, bool swig_mem_own, bool weak_global) {", director_classname);

if (first_class_dmethod != curr_class_dmethod) {
- Printf(w->def, "static struct {\n");
+ Printf(w->def, "struct {\n");
Printf(w->def, "const char *mname;\n");
Printf(w->def, "const char *mdesc;\n");
Printf(w->def, "jmethodID base_methid;\n");

I do realize this patch is a large performance impact for director class construction. I timed the impact with the following test program:

public class TimeDirectorCreation {
public static class MyImageClass extends ImageBase {
MyImageClass(double w, double h) {
super(w, h);
}
}

public static void main(String[] args) {
for (int i = 0; i < 1000000; ++i)
{
MyImageClass img = new MyImageClass(200, 200);
}
}
}

Where ImageBase is a director class with 13 methods that can be overridden. The runtime of the previous goes from 7 seconds to 12 seconds, or ~%70 increased overhead. Unfortunately the only way to get around this overhead would be to have thread-safe function local statics, which aren't a guarantee until C++11. So we feel safe is better than sorry, and director class construction is already fairly slow anyway to warrant the additional overhead. Does swig provides any other facility to initialize the function local statics safely that I've missed?

Cheers,
Brian

Discussion

  • William Fulton

    William Fulton - 2012-03-26

    We could use a new SWIG_LOCAL_STATIC macro to solve this. By default making it remain the same, ie static, but you can change it to nothing to make it non-static via

    %begin %{
    #define SWIG_STATIC_LOCAL
    %}

    See patch attached.

    Alternatively, wouldn't the following work better. Change the constructors, eg from:

    public Manager(String n) {
    this(exampleJNI.new_Manager(n), true);
    exampleJNI.Manager_director_connect(this, swigCPtr, swigCMemOwn, true);
    }

    to:

    public Manager(String n) {
    this(exampleJNI.new_Manager(n), true);
    synchronized(Employee.class) {
    exampleJNI.Manager_director_connect(this, swigCPtr, swigCMemOwn, true);
    }
    }

    So in other words lock the calls into the director_connect method so that the first instance sets up the static variables correctly in the C++ code. Could you try that out and then I'll put in something to support it. It would end up modifying the SWIG_PROXY_CONSTRUCTOR macro in java.swig slightly to:

    %define SWIG_PROXY_CONSTRUCTOR(OWNERSHIP, WEAKREF, TYPENAME...)
    %typemap(javaconstruct,directorconnect="\n synchronized($baseclass.class) {\n $imclassname.$javaclazznamedirector_connect(this, swigCPtr, swigCMemOwn, WEAKREF);\n }") TYPENAME {
    this($imcall, OWNERSHIP);$directorconnect
    }
    %enddef

    and adding in support for the $baseclass expansion.

     
  • William Fulton

    William Fulton - 2012-03-26

    Patch with SWIG_LOCAL_STATIC solution

     
  • Brian Cole

    Brian Cole - 2012-03-26

    Personally, I don't like the SWIG_STATIC_LOCAL solution since I think thread safety should be the default. Especially in Java, which tends to be heavily threaded.

    Using Java synchronize on the constructor is an interesting idea. Though it's potentially just as costly as reinitializing the table every time. Not to mention it definitely won't scale as more threads are used. I'm curious to see how the performance of this approach stacks up against just reinitializing every time. I'll run this comparison once I get back into the office.

    I think the ideal solution would be to take advantage of C++11 thread-safe function local statics. GCC has had thread-safe function local statics since 4.0. So the GCC side of things could be pretty fast and safe with something like the following:

    #if defined(__GNUC__) && (__GNUC__ > 3)
    #define SWIG_STATIC_LOCAL static
    #else
    #define SWIG_STATIC_LOCAL
    #endif

    struct MethodsTable
    {
    // the table of methods
    MethodsTable(JNIEnv *jenv)
    {
    // initialize the table
    }
    };

    void SwigDirector_example::swig_connect_director(JNIEnv *jenv, ...)
    {
    SWIG_STATIC_LOCAL MethodsTable table(jenv);

    ...
    }

    Note, it's important to encapsulate all the table initialization code inside a constructor.

    Though MSVC still doesn't have thread-safe function local statics in MSVC11, sigh... but one day it should and we can update the #ifdef then. Really wish C++11 would have provided testable macros for the various features.

     
  • William Fulton

    William Fulton - 2012-03-27

    I've second thoughts about the constructor synchronize. The JNIEnv is a thread local variable and it is used to obtain the method id for storing in base_methid. I thus suspect that this is thread specific and so if it is to be stored, it needs to be thread local. Your tests should confirm.

     
  • William Fulton

    William Fulton - 2012-03-27

    The output from -Xcheck:jni might reveal more in your testcase once the statics are protected with Java locks as these checks are meant to check that the correct JNI environment is used for a thread: http://www.oracle.com/technetwork/java/javase/clopts-139448.html#gbmtq

    If we can set this information for use across all threads, we should do it just once (when the class is loaded). We could implement this more efficiently by storing the information via a static constructor call.

     
  • Brian Cole

    Brian Cole - 2012-04-13

    I can't find anything in the JNI documentation to indicate that the jmethodID objects are thread local. Though some threaded tests from java with -Xcheck:jni should definitely fetter that out. The other reason to have thread local would be if it's possible to have two different JVMs running inside the same process. But does anyone do anything that crazy? Or is it even possible?

    A cross platform thread local abstraction is also quite tricky. Ours is 315 lines of some very finicky compiler dependent C++ code that only works on GCC 3.4+ and msvc7+.

    The ifdef for GCC 4.x is a lot simpler and covers most of the bases until C++11 becomes more widely available.

     
  • William Fulton

    William Fulton - 2012-04-17

    Anyone running two JVMs in one process has probably gone bonkers! I suspect a different jmethodID is required if the class has been loaded by two different class loaders, but am hoping that a single jmethodID is required if just one class loader is involved with multiple threads. Some more investigation / experimentation is required. And in that respect a test case that duplicates the problem you solved by removing the static keyword will help enormously in working that out. Is it just a matter of creating instances of the same director class from different threads concurrently?

    I think in normal usage - using one class loader - I'd like to determine if thread local variables or thread static initialisation is what is required. I also don't want to get involved in providing thread local variables outside of using the new standard 'thread_local'.

     
  • Brian Cole

    Brian Cole - 2012-04-17

    Yes, the test case is to create instances of the same director class from different threads concurrently. Being a race condition on static initialization, it's very non-deterministic, and many programs will initialize and work fine. To debug it I ran the same program many (10-100) times looking for any one to crash.

     
  • David Piepgrass

    David Piepgrass - 2012-04-17

    Er, what's the mechanism of this crash, exactly? I don't see why it wouldn't be safe to initialize the local variables 'baseclass' and (especially) 'methods' and on two threads concurrently. My copy of SWIG initializes baseclass like so:

    if (!baseclass) {
    baseclass = jenv->FindClass("Foo");
    if (!baseclass) return;
    baseclass = (jclass) jenv->NewGlobalRef(baseclass);
    }

    FindClass() should work OK on two threads, but as a static variable, a local reference should not be assigned to it; so the code should instead be written:

    if (!baseclass) {
    jclass baseclassLocal = jenv->FindClass("Foo");
    if (!baseclassLocal) return;
    baseclass = (jclass) jenv->NewGlobalRef(baseclassLocal);
    }

    This way, the worst-case scenario is that two global references are created, and one of them leaks.

     
  • Brian Cole

    Brian Cole - 2012-04-18

    C++ does not provide any guarantee that the assignment from baseclassLocal to baseclass will be atomic. In fact, C++11 specifically calls such an assignment a data-race with undefined behavior.

     
  • William Fulton

    William Fulton - 2020-08-28
     
  • William Fulton

    William Fulton - 2020-08-28
    • status: open --> closed
    • assigned_to: William Fulton
    • Group: -->
     

Log in to post a comment.