From: Kristian V. D. V. <va...@li...> - 2008-11-01 13:58:45
On Sat, 2008-11-01 at 14:44 +0100, Stefano D'Angelo wrote:
> 2008/11/1 Kristian Van Der Vliet <va...@li...>:
> > On Sat, 2008-11-01 at 11:42 +0100, Stefano D'Angelo wrote:
> >> I investigated a bit into Syllable's source code and it seems like
> >> some weird stuff is going on... First, in pthread_exit(), this cycle
> >> seems to be broken:
> >>
> >> while ( cleanup )
> >> {
> >>     if ( cleanup->routine )
> >>         (*cleanup->routine) (cleanup->arg);
> >>     cleanup = cleanup->prev;
> >>     free( cleanup );
> >> }
> >>
> >> since after the first loop, cleanup points to a freed memory location,
> >> which shouldn't be considered valid any more (unless you have very
> >> very strange memory handling routines/conventions, but I doubt that).
> >
> > Believe it or not that loop is actually O.K. It works backwards through
> > the list, calling the cleanup handlers and then freeing the structures
> > as it goes. For the first item in the list, prev will be NULL and
> > free(NULL) is a valid no-op, and the loop will then exit. I admit it is
> > non-obvious.

Anthony has pointed out to me that there is an issue here after all: the
first cleanup structure in the list is never freed, which is a valid
problem.
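
A minimal sketch of a loop that frees every node exactly once, including
the first, and never reads a node after freeing it. It assumes a node
with the routine, arg and prev fields used in the quoted snippet; the
names cleanup_node, run_cleanup_handlers, push_cleanup, count_call and
demo_cleanup are hypothetical and not taken from the Syllable source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout, modelled on the fields the quoted loop uses. */
struct cleanup_node {
    void (*routine)(void *);
    void *arg;
    struct cleanup_node *prev;
};

/* Run the handlers from newest to oldest and free each node once. */
void run_cleanup_handlers(struct cleanup_node *cleanup)
{
    while (cleanup) {
        /* Save the link first: the node must not be read after free(). */
        struct cleanup_node *prev = cleanup->prev;

        if (cleanup->routine)
            (*cleanup->routine)(cleanup->arg);
        free(cleanup);
        cleanup = prev;
    }
}

/* Hypothetical helper: push a handler and return the new list head.
 * If malloc() fails the handler is skipped and the old head is returned. */
struct cleanup_node *push_cleanup(struct cleanup_node *head,
                                  void (*routine)(void *), void *arg)
{
    struct cleanup_node *node = malloc(sizeof *node);

    if (!node)
        return head;
    node->routine = routine;
    node->arg = arg;
    node->prev = head;
    return node;
}

/* Test handler: increments the int that arg points to. */
void count_call(void *arg)
{
    ++*(int *)arg;
}

/* Build a three-node list (one node without a routine), run it, and
 * return how many handlers were actually called. */
int demo_cleanup(void)
{
    int calls = 0;
    struct cleanup_node *head = NULL;

    head = push_cleanup(head, count_call, &calls);
    head = push_cleanup(head, NULL, NULL);
    head = push_cleanup(head, count_call, &calls);
    assert(head != NULL);
    run_cleanup_handlers(head);
    return calls;
}
```

The only change from the quoted loop is the order of operations: the
prev pointer is saved before the current node is freed, and it is the
current node that is passed to free() rather than its predecessor.
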
> >> Going back to the original problem, it seems like pthread_exit()
> >> completely destroys the thread, which is not a POSIX-compliant
> >> behaviour, since it won't be found by pthread_join()s happening after
> >> the thread has terminated.
> >>
> >> This would be ok for PTHREAD_CREATE_DETACHED threads, but the POSIX
> >> standard states that the default detachstate is
> >> PTHREAD_CREATE_JOINABLE.
> >
> > The OpenGroup spec says:
> >
> > "The pthread_join() function suspends execution of the calling thread
> > until the target thread terminates, unless the target thread has already
> > terminated. On return from a successful pthread_join() call with a
> > non-NULL value_ptr argument, the value passed to pthread_exit() by the
> > terminating thread is made available in the location referenced by
> > value_ptr. When a pthread_join() returns successfully, the target thread
> > has been terminated. The results of multiple simultaneous calls to
> > pthread_join() specifying the same target thread are undefined. If the
> > thread calling pthread_join() is canceled, then the target thread will
> > not be detached.
> >
> > [ESRCH]
> > No thread could be found corresponding to that specified by the
> > given thread ID."
> >
> > which is what the current implementation does: if pthread_join() is
> > called on a thread that has already called pthread_exit() it returns
> > ESRCH. However it seems I have been tripped up by "When a pthread_join()
> > returns successfully, the target thread has been terminated."! Reading
> > that now my interpretation is that pthread_join() should return
> > immediately when called on a thread that has exited, rather than
> > returning ESRCH.
> >
> > I *think* there is a way to fix this, sort of: wait_for_thread() will
> > return -ECHILD if the target thread does not exist, which can be
> > interpreted as "Thread has exited". However there is then no way to
> > detect cases where the thread ID is invalid, i.e. to return ESRCH, but
> > this would seem to be a less important edge case.
>
> Well... it seems like the POSIX standard is a bit ambiguous (as usual
> :-P), but looking around I've always seen it implemented "the other
> way": pthread_join() will return 0 on terminated threads (that's for
> sure on Linux, FreeBSD, DragonFlyBSD and Haiku).
>
> However, it is my understanding that the original meaning of the
> standard is that you can have two types of threads:
> * PTHREAD_CREATE_DETACHED threads, which you just can't join or detach
> (thus can't get their return value) - those threads should be
> completely destroyed when they call pthread_exit();
> * PTHREAD_CREATE_JOINABLE threads (the default), for which the system
> should keep the return value (and hence a reference to the thread) in
> memory until pthread_join() is called.
>
> I don't think any other interpretation, even if valid according to the
> "literal standard", makes much sense.

No, I think you're right. Otherwise there is a potential race between
one thread exiting and another calling join: the joining thread could
treat an ESRCH return as a fatal error and abort, when it isn't one.
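
A small test along these lines could be used to check the behaviour
when porting. It is only a sketch: join_after_exit and worker are
hypothetical names, and the 100 ms delay is an assumption that merely
makes it very likely (not certain) that the worker has already called
pthread_exit() before the join. On an implementation that keeps
joinable threads around, as Stefano describes for Linux, FreeBSD,
DragonFlyBSD and Haiku, the join is expected to return 0 and hand back
the exit value rather than fail with ESRCH.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Worker that terminates at once, passing its argument to pthread_exit(). */
static void *worker(void *arg)
{
    pthread_exit(arg);
    return NULL; /* not reached */
}

/* Create a joinable thread (the default detachstate), wait until it has
 * most likely exited, then join it. Returns the error code from
 * pthread_create() or pthread_join() and stores the exit value in *out. */
int join_after_exit(intptr_t value, intptr_t *out)
{
    pthread_t tid;
    void *result = NULL;
    struct timespec delay = { 0, 100 * 1000 * 1000 }; /* 100 ms, assumed ample */
    int err;

    assert(out != NULL);

    err = pthread_create(&tid, NULL, worker, (void *)value);
    if (err != 0)
        return err;

    nanosleep(&delay, NULL);

    err = pthread_join(tid, &result);
    *out = (intptr_t)result;
    return err;
}
```

A caller would compare the return value against 0 and ESRCH to see
which of the two interpretations a given pthread implementation
follows.
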
> Don't misunderstand, I'm not trying to dictate what you should do, I'm
> just trying to help you (and me, porting that stuff).
No, this is very helpful. It's nice to have feedback at this sort of
level. Without it I can't improve Syllable :)
--
Vanders
http://www.syllable.org