From: Rainer T. <ta...@ta...> - 2009-08-26 12:52:25
|
Hello,

yug k wrote:
> Hi All,
>
> I've had a core dump with rxapi while testing ooRexx 4.0.0 on AIX
> 5.3.0.0. I raised a bug for this a couple of days ago, Bug
> ID 2844093 on SF. Please find the core dump attached to this mail.
>
What is the output of:

oslevel -s
lslpp -L xlC.rte
lslpp -L bos.rte.libc
ulimit -a

> I'm sending the core file by email because it exceeded the limit on
> SourceForge.
>
OK, here is a first analysis:

# dbx rxapi
Type 'help' for help.
[using memory image in core]
reading symbolic information ...
pthdb_session.c, 794: 0 PTHDB_INTERNAL (internal error)
pthreaded.c, 1800: PTHDB_INTERNAL (internal error)

Segmentation fault in _Incsize(unsigned long) at line 421 in file "/usr/vacpp/include/list"
  421   _Size += _N; }
(dbx) where
pthdb_session.c, 794: 0 PTHDB_INTERNAL (internal error)
pthreaded.c, 1800: PTHDB_INTERNAL (internal error)
_Incsize(unsigned long)(this = 0x0000005c, _N = 1), line 421 in "list"
_Insert(std::list<APIServerThread*,std::allocator<APIServerThread*> >::iterator,APIServerThread* const&)(this = 0x0000005c, _P = (...), _X = 0x30049d88), line 38 in "list.t"
push_back(APIServerThread* const&)(this = 0x0000005c, _X = 0x30049d88), line 343 in "list"
sessionTerminated(APIServerThread*)(this = (nil), thread = 0x30049d88), line 113 in "APIServer.cpp"   <----- why is this = nil ???
dispatch()(this = 0x30049d88), line 57 in "APIServerThread.cpp"
call_thread_function(void*)(argument = 0x30049d88), line 124 in "SysThread.cpp"
_global_lock_common(??, ??, ??) at 0xd0111440
(dbx)

> For more details about the bug, please refer to Bug ID 2844093 at:
>
> https://sourceforge.net/tracker/?func=detail&aid=2844093&group_id=119701&atid=684730
>
> Name : Yug
> Email : yk....@gm...
> SF User : ykforums
>
> regards,
>
> Yug

Bye
Rainer
From: Rainer T. <ta...@ta...> - 2009-08-26 19:41:05
|
Hello,

Rick McGuire wrote:
> But none of that code would modify the server pointer value in the
> enclosing APIServerThread instance. The root problem is the server
> field in that instance getting set to NULL. There are precisely 3
> references to that field in the entire code base.
>
> 1) The constructor for APIServerThread
> 2) The first line of APIServerThread.dispatch();
> 3) The second line of APIServerThread.dispatch();
>
> When this fails, we have a good value at 1&2, but a NULL value at 3.
> No error processing anywhere in the server touches that field...even
> the destructor for APIServerThread doesn't do anything. A memory
> overlay of some type still seems like the most likely explanation.
>
Yes, a memory overlay is most likely. But how is this possible?
Unfortunately there are no trace hooks in the code. They would simplify
the whole thing.

But could it be that this is a timing problem, with sessionTerminated()
getting called twice under heavy load? If I read this correctly, not all
return values of lock operations are checked. If this is (nil), then
some lock operations should fail, shouldn't they?

Would it not look similar if the termination code were called on an
already terminated instance?

I have no idea where to put traces in the code...

> Rick
>
Bye
Rainer
From: Rick M. <obj...@gm...> - 2009-08-26 19:55:10
|
On Wed, Aug 26, 2009 at 3:40 PM, Rainer Tammer<ta...@ta...> wrote:
> Hello,
>
> Rick McGuire wrote:
>> But none of that code would modify the server pointer value in the
>> enclosing APIServerThread instance. The root problem is the server
>> field in that instance getting set to NULL. There are precisely 3
>> references to that field in the entire code base.
>>
>> 1) The constructor for APIServerThread
>> 2) The first line of APIServerThread.dispatch();
>> 3) The second line of APIServerThread.dispatch();
>>
>> When this fails, we have a good value at 1&2, but a NULL value at 3.
>> No error processing anywhere in the server touches that field...even
>> the destructor for APIServerThread doesn't do anything. A memory
>> overlay of some type still seems like the most likely explanation.
>>
> Yes a memory overlap is most likely. But how is this possible ?
> Unfortunately there are no trace hooks in the code. This would simplify
> the whole thing.
>
> But could it be that this is a timing problem?
> That the sessionTerminated() gets called twice under heavy load? If I
> see this correctly
> not all return values of lock operations are checked. The this is (nil)
> then some lock
> operations should fail, or not ?

No, it would not. sessionTerminated() is called only by the thread being
terminated, by the thread object that is terminating. The crash is
occurring because sessionTerminated() is called using a null APIServer
pointer. There are no synchronization issues involved here because the
overlay has occurred before the point where locks are even obtained.

Rick

> Would it not look similar if the termination code gets called on an
> already terminated instance?
>
> I have no idea where to put traces in the code...
>
> Bye
> Rainer
From: Rick M. <obj...@gm...> - 2009-08-26 13:02:27
|
Hmmm, something very strange is going on here, most likely a memory
overlay of some sort. From the stack trace, the nil value passed to
sessionTerminated() is clearly wrong, which results in an address of
0x0000005c getting used for the list, causing the trap. My only
possible explanation is that some sort of memory overlay is clearing the
server pointer in the APIServerThread instance. This value was
clearly not zero at the beginning, otherwise things would have crashed
much earlier.

Rick
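The 0x0000005c in the trace is what one would expect if the list is a member sitting 0x5c bytes into the APIServer object: with a null object address, the address of a member is just its offset. A minimal sketch of that arithmetic follows. The layout is an assumption (the real APIServer layout is not shown in this thread), and `FakeServer`, `FakeList` and `memberAddress` are hypothetical names used only for illustration. It uses `offsetof` so that nothing is actually dereferenced through a null pointer.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>   // std::uintptr_t, std::uint32_t

// Hypothetical stand-in for the std::list member (only a size field here).
struct FakeList
{
    std::uint32_t size;
};

// Hypothetical stand-in for APIServer: 0x5c bytes of other state,
// followed by the list member. The real class layout is NOT known
// from the thread; the padding is chosen to match the trace.
struct FakeServer
{
    unsigned char otherState[0x5c];
    FakeList terminatedThreads;
};

// Address the compiled code would compute for the list member,
// given the address of the enclosing object. With objectAddress == 0
// the result is simply the member's offset.
std::uintptr_t memberAddress(std::uintptr_t objectAddress)
{
    return objectAddress + offsetof(FakeServer, terminatedThreads);
}
```

Under these assumptions, `memberAddress(0)` yields 0x5c, the `this` value the list functions received in the trace, while a valid object address yields an address inside the object.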
From: Rainer T. <ta...@ta...> - 2009-08-26 13:27:05
|
Hello,

Rick McGuire wrote:
> Hmmm, something very strange going on here, most likely a memory
> overlay of some sort. From the stack trace, the nil value passed to
> sessionTerminated() is clearly wrong, which results in an access of
> 0x0000005c getting used for the list, causing the trap. My only
> possible explanation is some sort of memory overlay is clearing the
> server pointer in the APIServerThread instance. This value was
> clearly not zero at the beginning, otherwise things would have crashed
> much earlier.
>
Yes, but unfortunately I am not deep enough into the inner workings of
the interpreter to find a good explanation for this...
Is it possible that this problem is related to thread synchronization?

> Rick
>
Bye
Rainer
From: Rick M. <obj...@gm...> - 2009-08-26 13:58:34
|
I doubt this is a synchronization problem. The APIServerThread instance
is only accessed by the local thread and really just has a few fields.
The call in question occurs here:

    /**
     * Dispatch the newly created reader thread to do it's work.
     */
    void APIServerThread::dispatch()
    {
        // just dispatch this back to the api server for handling
        server->processMessages(connection);
        server->sessionTerminated(this);
    }

The server variable has a valid value on the call to
processMessages(), but is NULL on the second call. So something has
somehow zeroed out that value. No other part of the API server has a
reference to this object once the thread is spun off, so there are no
synchronization issues. A wild store seems like the most likely
cause, but those are notoriously difficult to diagnose.

Rick
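One way to confirm that the field changes between the two calls is to keep a local copy of the pointer across processMessages() and compare afterwards. The following is only a sketch of that idea, not the ooRexx code: `Server`, `ServerThread` and `Connection` are hypothetical stand-ins, the `bool` return value is added for illustration, and the `clobber` member exists purely to simulate a wild store.

```cpp
#include <cassert>
#include <cstdio>

struct Connection {};                  // hypothetical placeholder
struct ServerThread;                   // defined below

// Hypothetical stand-in for APIServer.
struct Server
{
    ServerThread *clobber = nullptr;   // simulation hook: thread whose field gets zeroed
    int terminated = 0;

    void processMessages(Connection *connection);
    void sessionTerminated(ServerThread *) { ++terminated; }
};

// Hypothetical stand-in for APIServerThread.
struct ServerThread
{
    Server *server;
    Connection *connection;

    // Returns true if the server field survived processMessages() unchanged.
    bool dispatch()
    {
        Server *const saved = server;  // local copy, lives on this thread's stack
        saved->processMessages(connection);
        if (server != saved)
        {
            std::fprintf(stderr, "server field changed: was %p, now %p\n",
                         static_cast<void *>(saved), static_cast<void *>(server));
            saved->sessionTerminated(this);   // fall back to the saved copy
            return false;
        }
        server->sessionTerminated(this);
        return true;
    }
};

// Simulates the suspected overlay: zero the victim's server field.
inline void Server::processMessages(Connection *)
{
    if (clobber != nullptr)
    {
        clobber->server = nullptr;
    }
}
```

If a check like this fired in the real server, it would show the overwrite happening during processMessages(), and the saved copy would let the thread terminate cleanly instead of trapping. It would not identify what did the overwriting.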
From: Rainer T. <ta...@ta...> - 2009-08-26 14:20:41
|
Hello Rick,

Rick McGuire wrote:
> I doubt this a synchonization problem. The APIServerThread instance
> is only accessed by the local thread and really just has a few fields.
> The call in question occurs here:
>
> /**
>  * Dispatch the newly created reader thread to do it's work.
>  */
> void APIServerThread::dispatch()
> {
>     // just dispatch this back to the api server for handling
>     server->processMessages(connection);
>     server->sessionTerminated(this);
> }
>
> The server variable has a valid value on the call to
> processMessages(), but is NULL on the second. So something has
> somehow zeroed out that value. No other part of the API server has a
> reference to this object once the thread is spun off, so there are no
> synchronization issues. A wild store seems like the most likely
> cause, but those are notoriously difficult to diagnose.
>
> Rick
>
Inside server->processMessages(connection), in APIServer.cpp, there is:

    catch (std::bad_alloc &ba)
    {
        // this catches any C++ memory allocation errors, which we'll just return into a
        // memory failure result message.
        message.result = SERVER_ERROR;
    }

This only sets message.result, nothing else.
Could it be that there is a memory leak in rxapi and we hit this?
And that as a result we get the problem with sessionTerminated(this)?

Bye
Rainer
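For reference, the pattern in question, turning an allocation failure into an error result, can be sketched in isolation. This is a simplified illustration under assumptions, not the rxapi code: `Message`, `ResultCode`, `MESSAGE_OK` and `processOne` are hypothetical, and only the name SERVER_ERROR comes from the snippet above. The point it shows is that such a handler only writes to the message it was given.

```cpp
#include <cassert>
#include <new>       // std::bad_alloc

// Hypothetical result codes; only SERVER_ERROR appears in the thread.
enum ResultCode { MESSAGE_OK, SERVER_ERROR };

// Hypothetical message type carrying just a result field.
struct Message
{
    ResultCode result = MESSAGE_OK;
};

// Runs a handler for one message and converts an allocation failure
// into an error result. Nothing outside 'message' is modified here.
template <typename Handler>
void processOne(Message &message, Handler handler)
{
    try
    {
        handler(message);
    }
    catch (const std::bad_alloc &)
    {
        // report the memory failure through the message result
        message.result = SERVER_ERROR;
    }
}
```

In this sketch a handler that throws std::bad_alloc leaves the message with SERVER_ERROR and nothing else changed. Whether the real handler has other side effects cannot be told from the snippet quoted in the message.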
From: Rick M. <obj...@gm...> - 2009-08-26 14:34:17
|
But none of that code would modify the server pointer value in the
enclosing APIServerThread instance. The root problem is the server
field in that instance getting set to NULL. There are precisely 3
references to that field in the entire code base:

1) The constructor for APIServerThread
2) The first line of APIServerThread.dispatch();
3) The second line of APIServerThread.dispatch();

When this fails, we have a good value at 1 and 2, but a NULL value at 3.
No error processing anywhere in the server touches that field; even
the destructor for APIServerThread doesn't do anything. A memory
overlay of some type still seems like the most likely explanation.

Rick
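Since no code path legitimately writes the field after construction, one common way to narrow down an overlay is to surround the field with guard words and check them when the failure is detected. The sketch below illustrates the technique under assumptions. `GuardedPointer`, `kGuard` and `simulateOverlay` are hypothetical names, and nothing like this exists in the code discussed in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>   // std::memset

// Hypothetical wrapper: a pointer with a guard word on each side.
struct GuardedPointer
{
    static constexpr std::uint32_t kGuard = 0xDEADBEEF;

    std::uint32_t before = kGuard;
    void *value = nullptr;
    std::uint32_t after = kGuard;

    // True while neither guard word has been overwritten.
    bool intact() const
    {
        return before == kGuard && after == kGuard;
    }
};

// Simulates a wide overlay by zeroing the whole structure,
// guard words included.
void simulateOverlay(GuardedPointer &guarded)
{
    std::memset(static_cast<void *>(&guarded), 0, sizeof guarded);
}
```

If the pointer turned up NULL while both guards were still intact, that would suggest a narrow store aimed exactly at the field. Clobbered guards would point to a wider overwrite, such as a buffer overrun or a block being cleared.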