#847 rxapi crash on AIX 6.1 TL2SP3

Milestone: v4.0.1
Status: closed
Owner: nobody
Priority: 5
Updated: 2012-08-14
Created: 2009-10-13
Creator: Henning G
Private: No

Hi.

I have just installed ooRexx-4.0.0.64.rte.bff.

But after 2-3 hours rxapi core dumps with the following error (shown last).

My system has 2 CPUs with SMT enabled and 8 GB RAM.

I started by removing my old rexx (orexx 3.2), then I installed the 4.0
version.

I can see that a similar problem is reported (still open) on AIX 5.3 (and
32-bit rexx). In that incident you talk about a problem with PATH; I do not
have any ooRexx in the path.

Any idea what is wrong?

/HGA

From errpt:
//core
PROGRAM NAME
rxapi
STACK EXECUTION DISABLED
0
COME FROM ADDRESS REGISTER

PROCESSOR ID
hw_fru_id: 4
hw_cpu_id: 33

ADDITIONAL INFORMATION
llocatorX 180
llocatorX 174
Unable to generate symptom string.

Discussion

  • Henning G

    Henning G - 2009-10-19

    Hi.

    This is the first problem I have reported, so I don't know how things normally work.
    When can I expect someone to look at the problem?

     
  • Rainer Tammer

    Rainer Tammer - 2009-10-28

    Hello,
    this problem might be related to a possible compiler error.
    IBM is investigating this.

    It would be great if you could provide a test case which triggers this problem. I can not recreate this on my systems.

    Bye
    Rainer

     
  • Rainer Tammer

    Rainer Tammer - 2009-10-29

    Hello,
    it looks as if there is a problem in has_ctl(). This function belongs to the OS/libs. IBM is currently writing an APAR for this problem. I will report back as soon as I get new information.

    Bye
    Rainer

     
  • Rainer Tammer

    Rainer Tammer - 2009-12-11

    Hello,
    the fix from jfaucher (5345) fixed the problem. It was not the memory overlay that Rick suspected.

    Bye
    Rainer

     
  • Rainer Tammer

    Rainer Tammer - 2009-12-18

    Unfortunately the problem is not yet solved:

    It still does not work.

    The system started looping after 20 hours, and it eats all available memory.

    So I have removed ooRexx 4; the next try will be next year.

    Bye
    Rainer

     
  • Rainer Tammer

    Rainer Tammer - 2009-12-18

    Hello,
    I believe that the fix from jfaucher (5345) fixed the first problem with the rxapi crash. This sounds more like there is a memory leak in the code.
    Bye
    Rainer

     
  • jfaucher

    jfaucher - 2009-12-20

    Rainer,
    I checked the memory usage of rxapi under Linux with Valgrind (when running the test framework), and nothing wrong was reported.
    Maybe the test framework does not cover all the services managed by rxapi and it would be interesting to see if ubihga is using other services.

    The test I made (I will attach the file valgrind_test):

    APIService.cpp: comment out #define RUN_AS_DAEMON

    APIServer.cpp, add printf to see the received ServiceMessages

    make install

    valgrind --leak-check=full --show-reachable=yes --track-origins=yes --leak-resolution=high --num-callers=40 rxapi > valgrind_output 2>&1

    open another console

    rexx testOORexx.rex

    when done, kill <valgrind PID> --> SIGTERM halts valgrind properly

    (of course, maybe I'm completely wrong with this test, only Rick can say :-)
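
The steps above can be collected into one shell session. The following is only a sketch under assumptions: rxapi has already been rebuilt with RUN_AS_DAEMON commented out so that it stays in the foreground, both `rxapi` and `rexx` are on the PATH, and the helper names `leak_summary` and `run_rxapi_under_valgrind` as well as the 5 second startup delay are hypothetical and not part of ooRexx or Valgrind.

```shell
#!/bin/sh
# Sketch only: assumes rxapi was rebuilt with RUN_AS_DAEMON commented out,
# so that it stays in the foreground while Valgrind supervises it.

# Hypothetical helper: print the leak summary lines from a Valgrind log.
leak_summary() {
    grep -E 'definitely lost|indirectly lost|possibly lost|still reachable' "$1"
}

# Hypothetical helper: run the manual steps from the comment in one shell.
# $1 is the log file name (defaults to valgrind_output, as in the comment).
run_rxapi_under_valgrind() {
    log=${1:-valgrind_output}
    valgrind --leak-check=full --show-reachable=yes --track-origins=yes \
        --leak-resolution=high --num-callers=40 rxapi > "$log" 2>&1 &
    valgrind_pid=$!
    sleep 5                      # arbitrary delay so rxapi can start up
    rexx testOORexx.rex          # drive rxapi with the test framework
    kill -TERM "$valgrind_pid"   # SIGTERM stops Valgrind and rxapi
    wait "$valgrind_pid"         # wait until the report has been written
    leak_summary "$log"
}
```

Calling `run_rxapi_under_valgrind` from the test framework directory should leave the full report in `valgrind_output` and print only the summary lines, which makes it easier to compare runs.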

    Jean-Louis

     
  • jfaucher

    jfaucher - 2009-12-20

    The modified source files and the results from Valgrind

     
  • Rainer Tammer

    Rainer Tammer - 2010-01-27

    Hello,
    sorry for the delayed answer... I just read your comment. Thanks for your test on Linux.

    I can not reproduce this either on my test boxes. I hope that the two affected users can provide a sample. I have run some of the examples several thousand times - no problem. Right now I am completely stuck.

    Maybe the problem manifests itself when ooRexx is embedded in a C/C++ program.

    Bye
    Rainer

     
  • Rainer Tammer

    Rainer Tammer - 2010-01-27

    Hello,
    currently valgrind is not supported on AIX. According to the NEWS file there should be some experimental support in valgrind, but right now I am not able to build it on AIX.

    Bye
    Rainer

     
  • Mark Miesfeld

    Mark Miesfeld - 2010-02-26

    Hmm, it looks like Rainer had closed this on 12/11/2009.

    Then, it either got reopened when Jean-Louis posted a comment, or possibly when I was sorting things out for the bug fix release.

    The current:

    Status == Pending
    Resolution == None

    doesn't make sense.

    Rainer, could you take a look at this and set status and resolution to the proper values? If we are not sure whether it is resolved, then I think the status should be open. But, since you closed it at one point, that's your call.

    If it needs the svn commit 5345 to fix it, then we can merge that commit into the 4.0.1 branch, if it hasn't already been done. If you're unsure of how to do the merge, I'll do it.

    Thanks.

     
  • Rick McGuire

    Rick McGuire - 2010-02-26

    I believe I merged that one in already.

     
  • Mark Miesfeld

    Mark Miesfeld - 2010-09-08

    This was merged into the 4.0.1 release by Rick.

     

