Some aspects of the initialization code really need a rework.  There's definitely a race condition there just in creating the main semaphore that's used to synchronize everything.  I've got a whole laundry list of things I intend to clean up in this area, but I don't have enough of the other stuff I depend on finished yet.

One potential workaround would be to keep a single RexxStart session always active...maybe by calling back into an exit or function in your server that just puts the thread in a wait state.  Once that session is up, the initialization race condition will be avoided.
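
A rough sketch of that keep-alive idea, using POSIX threads.  Everything named here is made up for illustration: start_script() is only a stand-in for the real RexxStart call (it just invokes the callback directly), and park_until_shutdown() plays the part of the exit or function the keeper script would call back into.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  changed = PTHREAD_COND_INITIALIZER;
static bool session_ready = false;       /* keeper is inside its session */
static bool shutdown_requested = false;  /* server wants the keeper to exit */

/* Hypothetical callback: parks the calling thread until shutdown. */
static void park_until_shutdown(void)
{
    pthread_mutex_lock(&lock);
    assert(!session_ready);              /* only one keeper is expected */
    session_ready = true;
    pthread_cond_broadcast(&changed);    /* wake keeper_start() */
    while (!shutdown_requested)
        pthread_cond_wait(&changed, &lock);
    pthread_mutex_unlock(&lock);
}

/* Stand-in for RexxStart (assumption): a real keeper script would call
   back into the server; here the callback is simply invoked directly. */
static int start_script(void (*callback)(void))
{
    callback();
    return 0;
}

static void *keeper_main(void *unused)
{
    (void)unused;
    start_script(park_until_shutdown);
    return NULL;
}

/* Start the keeper and block until its session is up, so that no
   worker thread can start a script before initialization has finished. */
int keeper_start(pthread_t *keeper)
{
    int rc = pthread_create(keeper, NULL, keeper_main, NULL);
    if (rc != 0)
        return rc;
    pthread_mutex_lock(&lock);
    while (!session_ready)
        pthread_cond_wait(&changed, &lock);
    pthread_mutex_unlock(&lock);
    return 0;
}

/* Release the parked keeper thread and wait for it to finish. */
int keeper_stop(pthread_t keeper)
{
    pthread_mutex_lock(&lock);
    shutdown_requested = true;
    pthread_cond_broadcast(&changed);
    pthread_mutex_unlock(&lock);
    return pthread_join(keeper, NULL);
}
```

The server would call keeper_start() once before accepting requests and keeper_stop() at shutdown.  Whether one parked session is enough to avoid the race in the real interpreter is exactly the thing that would need trying.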

Rick

On Dec 23, 2007 11:46 AM, Mike Cowlishaw <MFC@uk.ibm.com> wrote:
Thanks for the quick response!

> > OK, I have now got to the point where the finger is
> > definitely pointing at
> > ooRexx.   If I serialize calls to RexxStart by putting a
> > mutex around it,
> > everything runs perfectly.   If unserialized,  the segmentation fault
> > occurs while Rexx is starting up and before any lines in the Rexx file
> > seem to be interpreted.    Here's what GDB shows:
> > ...
> > The routine it fails in is almost trivial, but says:    /* This method
> > should be called within a critical section */  and there are various
> > things in RexxInitialize that probably do the right thing
> > (this works fine
> > in Windows) but they depend on conditional code (#if SHARED
> > and the like).
> > So I'm wondering if the .deb package was built without some vital
> > compile-time option?
>
> Mike,
>
> That is possible.  But, Ubuntu is Linux and the compilation is done
> using the standard configure mechanism and gcc.  So, in theory, it
> should be compiled the same as if it were compiled on Fedora or SuSE.
>
> That doesn't mean that there isn't a problem with a compile-time
> option of course.  Just that I think the error will also show up if
> you were working on a SuSE or Fedora system.

Absolutely -- I assume that it would, on any Linux.  [I should have said:
I wonder if the Linux package(s) ...]

This looks like exactly the same problem I had when porting to the Nokia
N800 earlier this year, and that was under Debian (on x86) and Maemo (on
ARM).  I ran out of time (and the tools were inadequate) then, to
investigate -- but worked around it by making the web server
single-threaded (which was fine for a single-user application).  So this
happens on at least 3 Linux environments.

I have now re-ported the whole thing (with a complete review/rewrite of
the threading and critsec domains) from scratch .. and the problem is the
same.  I have also now kept the application multi-threaded, and the only
difference between 'works' and 'crashes' is a mutex lock around the
RexxStart call so that calls to ooRexx are serialized.
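
For reference, the serializing workaround amounts to something like the sketch below.  This is not the actual server code: start_script() is a hypothetical stand-in for the RexxStart call (it only counts invocations, so the sketch runs without ooRexx installed), and the script name and worker limit are made up.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 16

static pthread_mutex_t rexx_start_lock = PTHREAD_MUTEX_INITIALIZER;

/* Bookkeeping for the stand-in; only touched while the lock is held. */
static int calls_made = 0;
static int in_flight = 0;
static int max_in_flight = 0;

/* Hypothetical stand-in for RexxStart (assumption, not the real API). */
static int start_script(const char *script_name)
{
    (void)script_name;
    in_flight++;
    if (in_flight > max_in_flight)
        max_in_flight = in_flight;
    calls_made++;
    in_flight--;
    return 0;
}

/* Every request thread goes through here, so only one thread at a
   time is ever inside the interpreter start-up path. */
int serialized_start(const char *script_name)
{
    pthread_mutex_lock(&rexx_start_lock);
    int rc = start_script(script_name);
    pthread_mutex_unlock(&rexx_start_lock);
    return rc;
}

static void *worker(void *arg)
{
    serialized_start((const char *)arg);
    return NULL;
}

/* Launch up to n request threads and return how many calls completed. */
int run_workers(int n)
{
    pthread_t threads[MAX_WORKERS];
    int started = 0;
    if (n > MAX_WORKERS)
        n = MAX_WORKERS;
    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[started], NULL, worker,
                           (void *)"request.rex") != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return calls_made;
}
```

The cost is that Rexx scripts no longer run concurrently at all, since the lock is held for the whole call and not just for start-up.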

> Rick has opened up a task, I believe, to remove as much of the
> conditional compile code as possible.  (Tasks are where we are keeping
> track of things we are planning on doing.)
>
> Since it looks like the segmentation fault is in ooRexx, we definitely
> want to fix it.  The probability is highest that Rick will fix it, but
> one of these days I might fix a hard one myself.  <grin>

:-))

> You could open a bug in tracker. If you had some sample code that
> caused the problem and attached it to the bug that would be ideal.
> Almost every bug that has some simple program attached that
> demonstrates the bug has been fixed in a short time.

I can let you have a VMWare appliance that fails every time (it's about
2GB, zipped).  Since this is a timing-related multi-thread thing, in my
experience trying to make a failing small testcase takes much longer than
applying some thought-experiment.

I'll open a bug if that helps, but I'm happy to help out too -- I just
need some suggestions to try (reverse engineering the ooRexx kernel isn't
really a good way to address this kind of -- probably very simple --
problem).

Mike










