#1450 Performance drops suddenly (once even a crash)

5.0.0
closed
Erich
None
none
1
2023-01-01
2017-05-17
No

While trying to set up a testcase that can replicate the problems I have been reporting, the enclosed self-contained test-case reliably evokes a runtime problem which causes performance to drop significantly.

Here is the brief description from the 'readme.txt' file that explains the setup and purpose of the different files:

Abysmal performance creating objects when running multithreaded Rexx programs on multiple Rexx instances, once Rexx instances get terminated!

This test application demonstrates that after terminating the two additional Rexx interpreter instances (RII), the creation of rgfTest objects on a separate thread (and also their uninits) all of a sudden drops to an abysmal performance!

On one occasion a crash happened, which might help shed some light on the problem, so I enclosed the crash's stack trace together with the local window data for the crash position.

1 Attachments

Related

Bugs: #1450

Discussion

  • Per Olov Jonsson

    I have replicated this bug on macOS. To try it out, download and unpack the attached zip for Mac and read readme_mac.txt.

    I do not see the "abysmal performance" mentioned; the test cases run at good speed, but I DO get the crash.

     
  • Per Olov Jonsson

    Running pgm_02, I do not get a crash but several errors. PLEASE look at my question below.

    First Error

    Syscall param socketcall.sendto(msg) points to uninitialised byte(s)

    at sendto (in /usr/lib/system/libsystem_kernel.dylib)

    (Which is in the kernel of macOS as it seems)

    by this chain of calls

    SysSocketConnection::write(void, unsigned long, unsigned long)
    SysSocketConnection::write(void, unsigned long, void, unsigned long, unsigned long)
    ServiceMessage::writeMessage(SysClientStream&)
    ClientMessage::send(SysClientStream)
    ClientMessage::send()
    LocalAPIManager::establishServerConnection()
    LocalAPIManager::initProcess()
    LocalAPIContext::getAPIManager()
    RexxCreateSessionQueue

    all in librexxapi.5.0.0.dylib

    QUESTION:

    Looking into the source code of SysSocketConnection in SysCSStream.cpp, the definition is:

    bool SysSocketConnection::write(void *buf, size_t bufsize, size_t *byteswritten)

    i.e. it takes 3 arguments, yet I see one call with 5 arguments, which looks weird. Can this be the cause of the first error? The same applies to "bug2a".
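
    A note on the question above: the frames in the trace seem to have lost their `*` characters, so both `write` frames presumably take pointers. Two frames with the same name and different parameter lists are what C++ overloading looks like in a stack trace, so a five-argument `write` calling a three-argument `write` is not an error in itself. The Valgrind message concerns the contents of the buffer passed to sendto, not the number of arguments. A minimal sketch of such a pair of overloads, using a hypothetical `Connection` class with an in-memory `sent` buffer (an assumption for illustration, not the ooRexx source):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a socket connection; NOT the ooRexx class.
class Connection {
public:
    // Three-argument overload: write a single buffer.
    bool write(const void* buf, std::size_t bufSize, std::size_t* bytesWritten) {
        const unsigned char* p = static_cast<const unsigned char*>(buf);
        sent.insert(sent.end(), p, p + bufSize);
        *bytesWritten = bufSize;
        return true;
    }

    // Five-argument overload: join two buffers, then forward to the
    // three-argument overload above.
    bool write(const void* buf, std::size_t bufSize,
               const void* buf2, std::size_t buf2Size,
               std::size_t* bytesWritten) {
        std::vector<unsigned char> joined(bufSize + buf2Size);
        if (bufSize) std::memcpy(joined.data(), buf, bufSize);
        if (buf2Size) std::memcpy(joined.data() + bufSize, buf2, buf2Size);
        return write(joined.data(), joined.size(), bytesWritten);
    }

    std::vector<unsigned char> sent;  // records what would go to the socket
};
```

    Calling `c.write("ab", 2, "cde", 3, &n)` on a `Connection c` selects the five-argument overload, which in turn calls the three-argument one, giving the same two-frame pattern as in the trace.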

    -- Begin Valgrind --
    Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
    ==2020== at 0x100727FA6: sendto (in /usr/lib/system/libsystem_kernel.dylib)
    ==2020== by 0x100349CD7: SysSocketConnection::write(void, unsigned long, unsigned long) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x100349D77: SysSocketConnection::write(void, unsigned long, void, unsigned long, unsigned long) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x100349636: ServiceMessage::writeMessage(SysClientStream&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x10033CB76: ClientMessage::send(SysClientStream) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x10033CA77: ClientMessage::send() (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x10033D033: LocalAPIManager::establishServerConnection() (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x10033CF06: LocalAPIManager::initProcess() (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x10033CDC3: LocalAPIManager::getInstance() (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x10033CC08: LocalAPIContext::getAPIManager() (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x1003461F2: RexxCreateSessionQueue (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexxapi.5.0.0.dylib)
    ==2020== by 0x1001C16B2:

    -- End Valgrind --

    First warning (in the 1st thread, running libtest_gc.dylib)

    warning: no debug symbols in executable (-arch x86_64)

    Maybe more info can be harvested by adding debug symbols to libtest_gc

    -- Begin Valgrind --

    Interpreter::startInterpreter(Interpreter::InterpreterStartupMode) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== Address 0x7fff5fbfd3f2 is on thread 1's stack
    ==2020== in frame #6, created by LocalAPIManager::establishServerConnection() (???:)
    ==2020==
    --2020-- run: /usr/bin/dsymutil "/Users/po/bug3aMac/libtest_gc.dylib"
    warning: no debug symbols in executable (-arch x86_64)

    -- End Valgrind --

    2nd Error is a memory problem

    Bad permissions for mapped region at ??? in /usr/lib/libc++abi.dylib (system library)

    by a long chain of commands

    NativeActivation::run(ActivityDispatcher&)
    Activity::run(ActivityDispatcher&)
    CallRoutine

    all in librexx.5.0.0.dylib

    RexxThreadContext_::CallRoutine(RexxRoutineObject, _RexxArrayObject)
    RII_CallRoutine_impl(RexxCallContext, void, _RexxObjectPtr, _RexxArrayObject)
    RII_CallRoutine

    in libtest_gc.dylib

    NativeActivation::callNativeRoutine(RoutineClass, NativeRoutine, RexxString, RexxObject, unsigned long, ProtectedObject&)
    NativeRoutine::call(Activity, RoutineClass, RexxString, RexxObject, unsigned long, ProtectedObject&)
    RoutineClass::call(Activity, RexxString, RexxObject, unsigned long, ProtectedObject&)
    PackageManager::callNativeRoutine(Activity, RexxString, RexxObject, unsigned long, ProtectedObject&)
    SystemInterpreter::invokeExternalFunction(RexxActivation, Activity, RexxString*, RexxObject, unsigned long, RexxString*, ProtectedObject&)

    in librexx.5.0.0.dylib

    -- Begin Valgrind --
    ==2020== Process terminating with default action of signal 11 (SIGSEGV)
    ==2020== Bad permissions for mapped region at address 0x100E02C28
    ==2020== at 0x100E02C28: ??? (in /usr/lib/libc++abi.dylib)
    ==2020== by 0x100136792: NativeActivation::run(ActivityDispatcher&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== by 0x100167CBD: Activity::run(ActivityDispatcher&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== by 0x10010AA7C: CallRoutine (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== by 0x103EE8785: RexxThreadContext_::CallRoutine(RexxRoutineObject, _RexxArrayObject) (in /Users/po/bug3aMac/libtest_gc.dylib)
    ==2020== by 0x103EE864D: RII_CallRoutine_impl(RexxCallContext, void, _RexxObjectPtr, _RexxArrayObject) (in /Users/po/bug3aMac/libtest_gc.dylib)
    ==2020== by 0x103EE85AB: RII_CallRoutine (in /Users/po/bug3aMac/libtest_gc.dylib)
    ==2020== by 0x100135C86: NativeActivation::callNativeRoutine(RoutineClass, NativeRoutine, RexxString, RexxObject, unsigned long, ProtectedObject&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== by 0x100139E75: NativeRoutine::call(Activity, RoutineClass, RexxString, RexxObject, unsigned long, ProtectedObject&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== by 0x1000E3CCD: RoutineClass::call(Activity, RexxString, RexxObject, unsigned long, ProtectedObject&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== by 0x100162412: PackageManager::callNativeRoutine(Activity, RexxString, RexxObject, unsigned long, ProtectedObject&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)
    ==2020== by 0x1001B8150: SystemInterpreter::invokeExternalFunction(RexxActivation, Activity, RexxString*, RexxObject, unsigned long, RexxString*, ProtectedObject&) (in /Library/Frameworks/ooRexx.framework/Versions/A/Libraries/librexx.5.0.0.dylib)

     
  • Per Olov Jonsson

    The crash can be seen running valgrind rexx pgm_01.rex

    All errors as in pgm_02.rex

    In addition a crash occurs the first time a 2nd thread is created in the same manner as for bug2a

    Thread 2:
    Invalid read of size 4
    at 0x10087D899: _pthread_body (in /usr/lib/system/libsystem_pthread.dylib)
    by 0x10087D886: _pthread_start (in /usr/lib/system/libsystem_pthread.dylib)
    by 0x10087D08C: thread_start (in /usr/lib/system/libsystem_pthread.dylib)
    Address 0x18 is not stack'd, malloc'd or (recently) free'd

     
  • Per Olov Jonsson

    This bug is related to

    1449 2a Crash while executing createStackFrame()
    1450 3a Performance drops suddenly (once even a crash)
    1459 6a Test case causing crashes, giving different stack traces
    1461 8a Crash in "MemoryObject::markObjectsMain" et.al.
    1462 8b Memory leak when creating many Rexx interpreter instances
    1463 8c Using multiple Rexx interpreter instances causing crashes
    1464 8d Further crashes with a slightly changed abc.cls

    I have added a trace with more advanced trace options, but the conclusion is the same: the programs crash at the instance (no pun intended) the 2nd thread is created.

     

    Last edit: Per Olov Jonsson 2017-10-14
  • Per Olov Jonsson

    Here is one more trace file, made with these options:

    valgrind --track-origins=yes -v --trace-children=yes --leak-check=full --leak-resolution=high rexx <pgm>

     
  • Erich

    Erich - 2017-10-14

    P.O. thanks for your help, though I must say, I believe you'd need a DEBUG (or RELWITHDEBINFO (1)) ooRexx build and run it in the debugger to be able to figure out the root cause of this issue

    (1) often issues which are easily reproduced with a RELEASE build, don't surface in a DEBUG build. In such a case the only alternative is a RELWITHDEBINFO build (which is more difficult to debug)

     
    • Per Olov Jonsson

      Hello Erich, and thank you for your reply!

      Indeed, I have considered (and am still considering) creating a build of my own. I downloaded the complete trunk (I think that is the correct term?), oorexx-code-0, yesterday to start looking at the routines, methods etc. that are thrown up by Valgrind. I will look at the wiki to see if I can pull it off to MAKE my own build; otherwise I will seek your guidance again.

      The compiler for Mac, Clang/LLVM, seems to have some really great features for debugging. Xcode, the programming environment delivered with macOS, uses it, but I might use it from the command line as I am used to doing things the hard way :-).

      I will grind through all of the related bug reports from Rony and generate MAKE files for Mac for all of them. Don't spend any more time on them unless I mention something I did not see before. All these bug reports from Rony seem to boil down to the same thing: at the very instant a 2nd interpreter instance is created, the program crashes.

      Hälsningar/Regards/Grüsse,
      P.O. Jonsson
      oorexx@jonases.se
      Sent from my MacBookPro


      [bugs:#1450] https://sourceforge.net/p/oorexx/bugs/1450/ Performance drops suddenly (once even a crash)

      Status: open
      Group: 5.0.0
      Created: Wed May 17, 2017 12:05 PM UTC by Rony G. Flatscher
      Last Updated: Sat Oct 14, 2017 06:59 PM UTC
      Owner: nobody
      Attachments:

      abmysalPerformanceOnceACrash.zip https://sourceforge.net/p/oorexx/bugs/1450/attachment/abmysalPerformanceOnceACrash.zip (91.3 kB; application/x-zip-compressed)

       
      • Erich

        Erich - 2017-10-16

        MAKE my own build

        Hi P.O.,
        when trying to build, I'd really appreciate it if you could start with our existing CMake build setup and make any modifications necessary for Darwin regarding e.g.

        • install locations (/usr/local/bin, /usr/local/lib, /Users/userid/Applications/ooRexx5.0.0/bin, etc.) or
        • rxapid (see [bugs:#1476]),
        • and whatever else is necessary to build a working Darwin install package

        As of now, we don't have this. Rony's builds are custom-made "make" builds, and the Darwin build running on our Jenkins build machine (netrexx.org/jenkins/) doesn't make its build rpms publicly available.

        If you'd be willing to set up your Mac to run as a Jenkins slave to our build machine, I'd offer any help I can provide.

         

        Related

        Bugs: #1476

        • Per Olov Jonsson

          Hello Erich,

          I am quite close to having my own build. I am documenting what I do as I go along; hopefully it can be added to the wiki later. I guess some work is still necessary to prepare a standalone installer (I have no knowledge in that corner), but one step at a time. I am in Spain on holiday and do this using my telephone as a bridge to the internet, so I have only very little time occasionally to do something. Next week I am in Berlin and can set up one of my Macs as a slave; I trust I can come back to you for directions?

          Most of the tools necessary (like SVN) are already present on the Mac, so the only problem now is the cmake settings (and nmake).

          I have come this far and am experimenting with the settings:

          cmake -G "Unix Makefiles" /volumes/"Macintosh HD"/Users/po/oorexxsvn/main/trunk

          CMake Deprecation Warning at CMakeLists.txt:43 (cmake_policy):
          The OLD behavior for policy CMP0010 will be removed from a future version
          of CMake.

          The cmake-policies(7) manual explains that the OLD behaviors of all
          policies are deprecated and that a policy should be set to OLD only under
          specific short-term circumstances. Projects should be ported to the NEW
          behavior and not rely on setting a policy to OLD.

          -- The C compiler identification is AppleClang 9.0.0.9000037
          -- The CXX compiler identification is AppleClang 9.0.0.9000037
          -- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc

          <snip>

          CMake Warning (dev):
          Policy CMP0042 is not set: MACOSX_RPATH is enabled by default. Run "cmake
          --help-policy CMP0042" for policy details. Use the cmake_policy command to
          set the policy and suppress this warning.

          MACOSX_RPATH is not specified for the following targets:

          orxclassic
          orxclassic1
          orxexits
          orxfunction
          orxinvocation
          orxmethod
          wpipe1
          wpipe2
          wpipe3

          This warning is for project developers. Use -Wno-dev to suppress it.

          -- Generating done
          -- Build files have been written to: /Users/po/oorexxbuild

          But I have not had time to test the build, and I cannot judge the seriousness of these warnings/errors. I will continue to experiment; all help is welcome.

          These are the options I have

          Generators
          Unix Makefiles = Generates standard UNIX makefiles.
          Ninja = Generates build.ninja files.
          Xcode = Generate Xcode project files.
          CodeBlocks - Ninja = Generates CodeBlocks project files.
          CodeBlocks - Unix Makefiles = Generates CodeBlocks project files.
          CodeLite - Ninja = Generates CodeLite project files.
          CodeLite - Unix Makefiles = Generates CodeLite project files.
          Sublime Text 2 - Ninja = Generates Sublime Text 2 project files.
          Sublime Text 2 - Unix Makefiles = Generates Sublime Text 2 project files.
          Kate - Ninja = Generates Kate project files.
          Kate - Unix Makefiles = Generates Kate project files.
          Eclipse CDT4 - Ninja = Generates Eclipse CDT 4.0 project files.
          Eclipse CDT4 - Unix Makefiles = Generates Eclipse CDT 4.0 project files.
          KDevelop3 = Generates KDevelop 3 project files.
          KDevelop3 - Unix Makefiles = Generates KDevelop 3 project files.

          I have no idea what the other options are, but I guess Xcode might be good for Mac since that is the standard programming environment/debugger.

          If you want to look at the complete info from the cmake run let me know, I have saved it.

          Hälsningar/Regards/Grüsse,
          P.O. Jonsson
          oorexx@jonases.se
          Sent from my MacBookPro


           
        • Per Olov Jonsson

          Dear Erich,

          I have had some time again and got one step further.

          My first goal is to make a running, standalone version of ooRexx from scratch, after that I will try to make an installer.

          Problem 1: There are a number of warnings from CMake that I do not really know how to fix.

          Problem 2: Running make, I get several warnings and, in the end, 2 errors, probably as a result of problem 1 above?

          Remember I am doing this for the absolute first time so things that are obvious to you may not be obvious to me. Given that I cannot change anything in the source, where should I do the modifications to make this work? I will read your hints below in detail (again)

          I have CCed Rony since I hope he might have some intel to share (that would be much appreciated)

          Here is the shortened version of my trial, I have enclosed the entire journey as a text file:

          CMake Warning (dev):
          Policy CMP0042 is not set: MACOSX_RPATH is enabled by default. Run "cmake
          --help-policy CMP0042" for policy details. Use the cmake_policy command to
          set the policy and suppress this warning.

          MACOSX_RPATH is not specified for the following targets:

          orxclassic
          orxclassic1
          orxexits
          orxfunction
          orxinvocation
          orxmethod
          wpipe1
          wpipe2
          wpipe3

          This warning is for project developers. Use -Wno-dev to suppress it.

          -- Generating done
          -- Build files have been written to: /Users/po/oorexxbuild/release

          POs-MacBook-Pro:release po$ make

          [ 9%] Building CXX object CMakeFiles/rexxapi.dir/common/platform/unix/SysThread.cpp.o
          /Users/po/oorexxsvn/main/trunk/common/platform/unix/SysThread.cpp:90:32: warning:
          unknown warning group '-Wreturn-local-addr', ignored
          [-Wunknown-warning-option]

          #pragma GCC diagnostic ignored "-Wreturn-local-addr"

                                     ^
          

          1 warning generated.

          [ 66%] Building CXX object CMakeFiles/rexx.dir/interpreter/platform/unix/SysActivity.cpp.o
          /Users/po/oorexxsvn/main/trunk/interpreter/platform/unix/SysActivity.cpp:172:32: warning:
          unknown warning group '-Wreturn-local-addr', ignored
          [-Wunknown-warning-option]

          #pragma GCC diagnostic ignored "-Wreturn-local-addr"

                                     ^
          

          1 warning generated.

          [ 67%] Building CXX object CMakeFiles/rexx.dir/interpreter/platform/unix/SysFileSystem.cpp.o
          /Users/po/oorexxsvn/main/trunk/interpreter/platform/unix/SysFileSystem.cpp:174:12: warning:
          'tmpnam' is deprecated: This function is provided for compatibility
          reasons only. Due to security concerns inherent in the design of
          tmpnam(3), it is highly recommended that you use mkstemp(3) instead.
          [-Wdeprecated-declarations]
          return tmpnam(NULL);
          ^
          /usr/include/stdio.h:275:1: note: 'tmpnam' has been explicitly marked deprecated
          here
          __deprecated_msg("This function is provided for compatibility reasons on...
          ^
          /usr/include/sys/cdefs.h:180:48: note: expanded from macro '__deprecated_msg'
          #define __deprecated_msg(_msg) __attribute__((deprecated(_msg)))
          ^
          1 warning generated.

          [ 70%] Building CXX object CMakeFiles/rexx.dir/common/platform/unix/SysThread.cpp.o
          /Users/po/oorexxsvn/main/trunk/common/platform/unix/SysThread.cpp:90:32: warning:
          unknown warning group '-Wreturn-local-addr', ignored
          [-Wunknown-warning-option]

          #pragma GCC diagnostic ignored "-Wreturn-local-addr"

                                     ^
          

          1 warning generated.

          [ 74%] Linking CXX shared library bin/librxunixsys.dylib
          ld: library not found for -lcrypt
          clang: error: linker command failed with exit code 1 (use -v to see invocation)
          make[2]: *** [bin/librxunixsys.5.0.0.dylib] Error 1
          make[1]: *** [CMakeFiles/rxunixsys.dir/all] Error 2
          make: *** [all] Error 2
          POs-MacBook-Pro:release po$

          This is where I am today.

          Hälsningar/Regards/Grüsse,
          P.O. Jonsson
          oorexx@jonases.se
          Sent from my MacBookPro


           

          Related

          Bugs: #1450
          Bugs: #1476

    • Per Olov Jonsson

      Please find attached a trace file made with a debug build of ooRexx. I still get "Segmentation fault: 11" when running this test case on macOS.

      -Checked out revision 11317, complete
      -cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=DEBUG ../../oorexxsvn/main/trunk
      -make

       
  • Moritz Hoffmann

    Moritz Hoffmann - 2017-10-16

    The attached example also crashes on Linux with current trunk. From what I see I suspect a memory corruption issue. With debugging enabled, one of the threads hangs in DeadObject::getObjectSize, without debugging I'm getting random segfaults. I'm almost certain that passing memory locations between different instances causes the problem because each instance has its own allocator. Easy solution is to avoid doing that, second solution is to make sure that pointers are not passed between different allocations, third solution is to be very careful when passing pointers. As a fail-early mechanism one could trigger a GC cycle on every thread attach/detach point.
    I am not particularly inclined to debugging the problem as it seems like a bad example to me. Rexx has a global interpreter lock so performance won't go up when using multiple threads - only interleaving is possible. However, it would be possible to write a serializer in the library code that makes sure only one Rexx interpreter exists at any point in time, or that the library only talks to one.

     
    • Erich

      Erich - 2017-10-16

      each instance has its own allocator.
      Easy solution is to avoid doing that

      Moritz, if that's a quick fix .. what code changes would be required?
      Or do you mean: don't use multiple instances?

       
      • Moritz Hoffmann

        Moritz Hoffmann - 2017-10-16

        I don't know a quick fix. Not using multiple instances might solve the problem right now.

         
      • Anonymous

        Anonymous - 2017-10-16

        There is only one memory allocator per process. All objects are allocated from the same memory heap. The allocator is serialized by the use of the kernel lock that only allows one thread access to the interpreter code at one time. Objects can be shared just fine between instances.

        If I had to make a guess, something during the creation of a new instance is allocating an object while it does not hold the kernel lock, resulting in two threads calling the object allocator at the same time. A lot of the crashes could be explained by that happening, but I've not been able to spot any windows in the code where that might be occurring.
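
        The hypothesis above can be turned into a fail-early check: if the allocator knows whether the calling thread holds the lock, it can flag any allocation made without it instead of silently racing. A minimal sketch, assuming hypothetical `KernelLock` and `Heap` classes (neither is the ooRexx implementation, and whether the interpreter has such a check is not stated in this thread):

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the single process-wide kernel lock.
// A thread-local flag records whether the calling thread currently holds it.
class KernelLock {
public:
    void acquire() { mutex_.lock(); callerHolds_ = true; }
    void release() { callerHolds_ = false; mutex_.unlock(); }
    static bool heldByCaller() { return callerHolds_; }
private:
    std::mutex mutex_;
    static thread_local bool callerHolds_;
};
thread_local bool KernelLock::callerHolds_ = false;

// Hypothetical process-wide heap: a bump counter that is only safe to
// touch while the kernel lock is held.
class Heap {
public:
    // Hands out the next slot number. An allocation made without the
    // lock is counted as a violation so it becomes visible early.
    std::size_t allocate() {
        if (!KernelLock::heldByCaller()) {
            ++violations_;
        }
        return next_++;
    }
    std::size_t violations() const { return violations_.load(); }
private:
    std::size_t next_ = 0;                     // protected by the kernel lock
    std::atomic<std::size_t> violations_{0};   // safe to bump from any thread
};
```

        In a debug build the counter could be replaced by an abort, so the offending stack is captured at the moment of the lock-less allocation.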

        Unfortunately, people are only posting the top few entries of execution stacks and generally only the thread where the failures occur. Full stacks of all of the threads might actually identify where this is happening.

         
        • Rick McGuire

          Rick McGuire - 2017-10-16

          Oops, didn't notice I wasn't logged in. This was from Rick.

           
        • Moritz Hoffmann

          Moritz Hoffmann - 2017-10-16

          Hi Rick, thanks for the clarification. Attached you'll find two different traces, one resulting in a segmentation fault, the other just hanging in DeadObject. Let me know if you need more info, I can rerun things while in the office.

           
          • Rick McGuire

            Rick McGuire - 2017-10-16

            We may have a winner. The hang trace pinpointed a place where objects were being allocated without holding the lock. This patch might fix the problem.

             
  • Rick McGuire

    Rick McGuire - 2017-10-16

    Hmmmm, the hang traces likely point to a bug in the TableIterator class. This iterator is used when traversing the uninit table and is intended to allow iteration with the ability to delete items without affecting the iteration. It looks like it is stuck in an infinite loop. If that occurs again, you might be able to figure out what sort of condition caused that failure. The rest of the threads look like they are in good places.
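
    For reference, iteration that tolerates deletion usually depends on advancing exactly once per step, either through the iterator returned by the erase or through an explicit increment; a loop that does neither on some path never terminates. A minimal sketch using `std::map` and a hypothetical `removeFlagged` helper (an illustration of the general technique, not the ooRexx TableIterator):

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical table of pending entries keyed by an object id.
// The bool value marks entries that should be removed during the walk.
using Table = std::map<int, bool>;

// Walks the whole table and erases flagged entries without invalidating
// the traversal: std::map::erase(iterator) returns the iterator that
// follows the removed element, so every path through the loop advances.
std::size_t removeFlagged(Table& table) {
    std::size_t removed = 0;
    for (auto it = table.begin(); it != table.end(); ) {
        if (it->second) {
            it = table.erase(it);   // advance via the erase result
            ++removed;
        } else {
            ++it;                   // advance explicitly
        }
    }
    return removed;
}
```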

     
    • Rick McGuire

      Rick McGuire - 2017-10-16

      Oops, strike that, I was looking at the segv trace, not the hang.

       
  • Erich

    Erich - 2017-10-16

    Committed Rick's 1450.patch code fix with revision [r11312]

    Moritz, can you test if that fixes the issue (or parts thereof)?

     

    Related

    Commit: [r11312]

    • Moritz Hoffmann

      Moritz Hoffmann - 2017-10-16

      It seems to have fixed the memory corruption issue. Now it's getting stuck in pthread_cond_wait, but that might be a problem of the example. Nevertheless, attached are the stack traces.

       
      • Rick McGuire

        Rick McGuire - 2017-10-16

        On Mon, Oct 16, 2017 at 7:51 AM, Moritz Hoffmann antiguru@users.sf.net
        wrote:

        > It seems to have fixed the memory corruption issue. Now it's getting stuck
        > in pthread_cond_wait, but that might be a problem of the example.
        > Nevertheless, attached are the stack traces.

        I suspect it is. I noticed that several threads were in a guard wait state
        in the hang trace.

        Rick


        ** [bugs:#1450] Performance drops suddenly (once even a crash)**

        Status: open
        Group: 5.0.0
        Created: Wed May 17, 2017 12:05 PM UTC by Rony G. Flatscher
        Last Updated: Mon Oct 16, 2017 11:31 AM UTC
        Owner: nobody
        Attachments:


      • Rick McGuire

        Rick McGuire - 2017-10-16

        Was able to look at the tracebacks. All of the threads but one are blocked trying to call a guarded method on the same object. The remaining thread is in a guard wait, obviously waiting for something to happen that will not occur because all of the other threads are blocked.
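The shape of that hang can be sketched in Python as an analogy only; it does not model how ooRexx implements guarded methods. `holds_lock_while_waiting` and `releases_lock_while_waiting` are hypothetical names, and the timeouts exist only so the sketch returns instead of hanging. In the first function the waiter keeps the lock that the signalling thread needs, so the signal never arrives. In the second, `threading.Condition` releases the lock during the wait, so the signal gets through.

```python
import threading


def holds_lock_while_waiting(timeout=0.2):
    """Waiter keeps the lock; the thread that could signal it is blocked."""
    lock = threading.Lock()
    ready = threading.Event()

    def setter():
        # Needs the same lock before it can signal, like a blocked guarded call.
        if lock.acquire(timeout=timeout):
            try:
                ready.set()
            finally:
                lock.release()

    with lock:
        t = threading.Thread(target=setter)
        t.start()
        # Waits longer than the setter is willing to wait for the lock.
        signalled = ready.wait(timeout=timeout * 2)
    t.join()
    return signalled


def releases_lock_while_waiting(timeout=2.0):
    """Waiter gives up the lock while waiting, so the setter can run."""
    cond = threading.Condition()
    state = {"ready": False}

    def setter():
        with cond:
            state["ready"] = True
            cond.notify_all()

    with cond:
        t = threading.Thread(target=setter)
        t.start()
        # wait_for releases the underlying lock until notified or timed out.
        signalled = cond.wait_for(lambda: state["ready"], timeout=timeout)
    t.join()
    return signalled
```

In the first variant the wait times out without ever being signalled. Without the timeouts it would block forever, which matches the traceback pattern of one waiting thread and everyone else queued behind it.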

         
  • Erich

    Erich - 2017-10-16
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -2,10 +2,9 @@
    
     Here the brief description from the 'readme.txt' file that explains the setup and purpose of the different files:
    
    -~~~
    +
     Abmysal performance creating objects when running multihreaded Rexx programs on multiple Rexx instances, once Rexx instances get terminated!
    
     This test application demonstrates that after terminating the two additional Rexx interpreter instances (RII) the creations of rgfTest objects on a separate thread (and also their uninits) all of a sudden drops to an abmysal peformance!
    
     At one occasion a crash happened, which might help shed some light to the problem, so I enclosed the crash'es stack trace together with the local window data for the crash position.
    -~~~
    
    • status: open --> accepted
    • assigned_to: Erich
     