|
From: Bart V. A. <bar...@gm...> - 2009-07-23 08:06:11
|
Hello,

While trying to port DRD to Darwin I noticed that on the Mac OS X system I
used several POSIX threads functions behave differently than their Linux
equivalents. Please note that the results below have been obtained on an old
system (kernel 9.2.0). It would be great if someone could confirm whether
the results below can be reproduced on a Mac OS X system with the latest
updates installed. Note: the results below have been obtained by running
native executables and hence are not related to the behavior of any Valgrind
tool.

* drd/tests/pth_inconsistent_cond_wait

Linux:

$ drd/tests/pth_inconsistent_cond_wait
(empty output, as expected)

Darwin:

$ drd/tests/pth_inconsistent_cond_wait
drd/tests/pth_inconsistent_cond_wait
pth_inconsistent_cond_wait.c:52 pthread_cond_timedwait(&s_cond, mutex,
&deadline) returned error code 22 (Invalid argument)
pth_inconsistent_cond_wait.c:52 pthread_cond_timedwait(&s_cond, mutex,
&deadline) returned error code 60 (Operation timed out)

It is not clear to me why the first call to pthread_cond_timedwait() returns
EINVAL on Darwin. As far as I know EINVAL means that either the condition
variable, mutex or timeout passed to pthread_cond_timedwait() is invalid?
See also
http://www.opengroup.org/onlinepubs/000095399/functions/pthread_cond_timedwait.html .

* helgrind/tests/tc17_sembar

Linux:

$ helgrind/tests/tc17_sembar
starting
done, result is 88, should be 88
(expected output)

Darwin:

$ helgrind/tests/tc17_sembar
starting
done, result is 99, should be 88
(not expected)

I've also noticed that DRD reports race conditions in the
drd/tests/circular_buffer test program on Darwin but not on Linux. Both the
tc17_sembar and the circular_buffer test program trigger many semaphore
calls.

Bart. |
|
From: Alexander P. <gl...@go...> - 2009-07-23 11:13:01
|
I've tried the tests on Mac OS 10.5 (Darwin Kernel Version 9.7.0: Tue
Mar 31 22:52:17 PDT 2009).
Here are the results:

glider$ ./pth_inconsistent_cond_wait
pth_inconsistent_cond_wait.c:52 pthread_cond_timedwait(&s_cond, mutex,
&deadline) returned error code 22 (Invalid argument)

This is a bit confusing.

glider$ ./tc17_sembar
starting
done, result is 99, should be 88

This is expected, because sem_init is not implemented on Darwin; only
named semaphores are supported.

--
WBR,
Alexander Potapenko
Software Engineer
Google Moscow |
|
From: Alexander P. <gl...@go...> - 2009-07-23 11:22:53
|
By the way, pth_inconsistent_cond_wait.c uses sem_init() as well, so it's
incorrect to run it on Darwin.
However, replacing sem_init with the appropriate sem_open call does not
help to fix pthread_cond_timedwait().

--
WBR,
Alexander Potapenko
Software Engineer
Google Moscow |
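For readers following along, the sem_init() to sem_open() replacement mentioned above could look roughly like the minimal sketch below. The helper names portable_sem_create() and portable_sem_destroy() and the "/drdsem-..." name pattern are made up for illustration; they are not taken from the DRD or Helgrind test sources. The sketch assumes that unlinking the name right after sem_open() is acceptable, which gives a semaphore that behaves much like an unnamed one.

```c
#include <fcntl.h>      /* O_CREAT, O_EXCL */
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>     /* getpid */

/*
 * Hypothetical helper: create a process-private semaphore through the
 * named-semaphore API, which is available on both Linux and Darwin.
 * Returns NULL on failure. The static counter is not thread-safe, so
 * call this from one thread only (e.g. during test setup).
 */
sem_t *portable_sem_create(unsigned initial_value)
{
    static unsigned counter;
    char name[32];
    sem_t *sem;

    /* Keep the name short; some systems limit semaphore name length. */
    snprintf(name, sizeof(name), "/drdsem-%ld-%u", (long)getpid(), counter++);

    sem = sem_open(name, O_CREAT | O_EXCL, 0600, initial_value);
    if (sem == SEM_FAILED)
        return NULL;

    /* Remove the name; the semaphore stays usable until sem_close(). */
    sem_unlink(name);
    return sem;
}

/* Hypothetical counterpart of sem_destroy() for semaphores made above. */
int portable_sem_destroy(sem_t *sem)
{
    return sem_close(sem);
}
```

A test would then call portable_sem_create(0) where it previously called sem_init(&sem, 0, 0), and pass the returned pointer to sem_wait() and sem_post().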
|
From: Bart V. A. <bar...@gm...> - 2009-07-23 19:28:59
|
On Thu, Jul 23, 2009 at 1:12 PM, Alexander Potapenko<gl...@go...> wrote:
> I've tried the tests on Mac OS 10.5 (Darwin Kernel Version 9.7.0: Tue
> Mar 31 22:52:17 PDT 2009)
> Here are the results
>
> glider$ ./pth_inconsistent_cond_wait
> pth_inconsistent_cond_wait.c:52 pthread_cond_timedwait(&s_cond, mutex,
> &deadline) returned error code 22 (Invalid argument)
>
> this is a bit confusing.
>
> glider$ ./tc17_sembar
> starting
> done, result is 99, should be 88
>
> this is expected, because sem_init is not implemented on Darwin, only
> named semaphores are supported.

Thanks for the help. After having replaced sem_init() by sem_open() in
the pth_inconsistent_cond_wait test program, I still see the same
puzzling error message:

$ drd/tests/pth_inconsistent_cond_wait
pth_inconsistent_cond_wait.c:77 pthread_cond_timedwait(&s_cond, mutex,
&deadline) returned error code 22 (Invalid argument)

Bart. |
|
From: Greg P. <gp...@ap...> - 2009-07-23 20:31:40
|
On Jul 23, 2009, at 12:28 PM, Bart Van Assche wrote:
> Thanks for the help. After having replaced sem_init() by sem_open() in
> the pth_inconsistent_cond_wait test program, I still see the same
> puzzling error message:
> $ drd/tests/pth_inconsistent_cond_wait
> pth_inconsistent_cond_wait.c:77 pthread_cond_timedwait(&s_cond, mutex,
> &deadline) returned error code 22 (Invalid argument)

The test is using a single cond with two different mutexes at the same
time. POSIX says that's undefined; Darwin detects it and returns EINVAL.

http://www.opengroup.org/onlinepubs/000095399/functions/pthread_cond_wait.html
"When a thread waits on a condition variable, having specified a
particular mutex to either the pthread_cond_timedwait() or the
pthread_cond_wait() operation, a dynamic binding is formed between that
mutex and condition variable that remains in effect as long as at least
one thread is blocked on the condition variable. During this time, the
effect of an attempt by any thread to wait on that condition variable
using a different mutex is undefined."

Libc/pthreads/pthread_cond.c:

    else if ((busy = cond->busy) != mutex)
    {
        /* Must always specify the same mutex! */
        cond->waiters--;
        UNLOCK(cond->lock);
        return (EINVAL);
    }

The comment at the top of the test suggests the test does this on
purpose, and that drd is supposed to catch it first. Perhaps you're not
intercepting all variants of pthread_cond_timedwait? UNIX conformance
means some Libc functions have multiple versions.

0007da80 (__TEXT,__text) external _pthread_cond_timedwait
000e1787 (__TEXT,__text) external _pthread_cond_timedwait$NOCANCEL$UNIX2003
000589a8 (__TEXT,__text) external _pthread_cond_timedwait$UNIX2003

--
Greg Parker gp...@ap... Runtime Wrangler |
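The dynamic binding described in the POSIX quote can be modelled with a few lines of bookkeeping. The sketch below is only an illustration of the idea: the field names busy and waiters follow the Libc excerpt above, but struct cond_binding and the cond_binding_*() functions are invented here and are neither Darwin's implementation nor DRD's checking code.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical bookkeeping kept alongside one condition variable. */
struct cond_binding {
    pthread_mutex_t  lock;     /* protects busy and waiters */
    pthread_mutex_t *busy;     /* mutex currently bound, NULL if none */
    int              waiters;  /* threads currently registered as waiting */
};

int cond_binding_init(struct cond_binding *b)
{
    b->busy = NULL;
    b->waiters = 0;
    return pthread_mutex_init(&b->lock, NULL);
}

/*
 * Call just before waiting on the condition variable with `mutex`.
 * Returns 0 and registers the waiter, or EINVAL if other threads are
 * already waiting with a different mutex.
 */
int cond_binding_enter(struct cond_binding *b, pthread_mutex_t *mutex)
{
    int ret = 0;

    pthread_mutex_lock(&b->lock);
    if (b->waiters > 0 && b->busy != mutex) {
        ret = EINVAL;          /* inconsistent mutex: refuse to wait */
    } else {
        b->busy = mutex;       /* form or confirm the binding */
        b->waiters++;
    }
    pthread_mutex_unlock(&b->lock);
    return ret;
}

/* Call after the wait returns; the binding ends with the last waiter. */
void cond_binding_leave(struct cond_binding *b)
{
    pthread_mutex_lock(&b->lock);
    if (--b->waiters == 0)
        b->busy = NULL;
    pthread_mutex_unlock(&b->lock);
}
```

In this model a second thread that tries to wait with another mutex while the first is still blocked gets EINVAL, which matches the error code reported by the test on Darwin. Once all waiters have left, a different mutex may be used again.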
|
From: Bart V. A. <bar...@gm...> - 2009-07-24 11:28:42
|
On Thu, Jul 23, 2009 at 10:31 PM, Greg Parker <gp...@ap...> wrote:
> The test is using a single cond with two different mutexes at the same
> time. POSIX says that's undefined; Darwin detects it and returns EINVAL.

Thanks for the clarification. I have modified the
drd/tests/pth_inconsistent_cond_wait.c test program so that it no longer
prints any error messages when run as a regression test. DRD now
produces exactly the same output for this test program on Linux and on
Darwin, which is good.

When running DRD's regression tests on Darwin, 63 out of 64 regression
tests always pass and one regression test, rwlock_test, fails
sometimes. If that test fails it is because DRD reported one or more
conflicting memory loads triggered by line 35, the statement that
reads s_counter protected by a reader lock. I have not yet seen such
behavior on Linux, so I should examine rwlock support in DRD on Darwin
more closely. It would help if I could have a look at the
implementation of POSIX reader-writer locks in Darwin. Is this source
code available online and under a license that is compatible with the
development of source code under the GPL?

Bart. |
|
From: Nicholas N. <n.n...@gm...> - 2009-07-24 19:27:19
|
On Fri, Jul 24, 2009 at 9:28 PM, Bart Van Assche<bar...@gm...> wrote:
>
> It would help if I could have a look at the
> implementation of POSIX reader-writer locks in Darwin. Is this source
> code available online and under a license that is compatible with the
> development of source code under the GPL ?

Go to http://www.opensource.apple.com/release/mac-os-x-1057/, look for
"Libc-498.1.7". (For those who are interested, xnu-1228.12.14 is the
kernel code and is also useful for Valgrind development, especially
understanding syscalls.) It's available under APSL, BSD, MIT licenses
so should be GPL compatible. Click on the arrow to download it. Then
look in pthreads/pthread_rwlock.c.

Nick |
|
From: Greg P. <gp...@ap...> - 2009-07-24 22:09:48
|
On Jul 24, 2009, at 4:28 AM, Bart Van Assche wrote:
> When running DRD's regression tests on Darwin, 63 out of 64 regression
> tests always pass and one regression test, rwlock_test, fails
> sometimes. If that test fails it is because DRD reported one or more
> conflicting memory loads triggered by line 35, the statement that
> reads s_counter protected by a reader lock.

Darwin's pthread_rwlock_init() reads a value from the pthread_rwlock_t
struct before initializing anything. Valgrind doesn't like that. (They
say it's required for Unix compliance; I say it's a bug and the
standard requires no such thing.)

--
Greg Parker gp...@ap... Runtime Wrangler |
|
From: Bart V. A. <bar...@gm...> - 2009-07-25 11:59:51
|
On Sat, Jul 25, 2009 at 12:09 AM, Greg Parker<gp...@ap...> wrote:
> Darwin's pthread_rwlock_init() reads a value from the pthread_rwlock_t
> struct before initializing anything. Valgrind doesn't like that. (They say
> it's required for Unix compliance; I say it's a bug and the standard
> requires no such thing.)

With Nick's help I found the source code of the rwlock implementation:
http://www.opensource.apple.com/source/Libc/Libc-498.1.7/pthreads/pthread_rwlock.c

The fact that pthread_rwlock_init() reads a value from the
pthread_rwlock_t struct is indeed surprising -- this could cause
initialization to be skipped for rwlock structures allocated on the
stack.

Since r10599 all DRD regression tests pass -- the regression test
rwlock_test no longer fails. Thanks for the help. I will now test
DRD with larger applications.

Bart. |
|
From: Bart V. A. <bar...@gm...> - 2009-07-26 13:54:22
|
On Sat, Jul 25, 2009 at 1:59 PM, Bart Van Assche<bar...@gm...> wrote:
> Since r10599 all DRD regression tests pass -- the regression test
> rwlock_test does no longer fail. Thanks for the help. I will now test
> DRD with larger applications.

Another question: are classes like NSThread, NSLock, NSCondition and
NSConditionLock entirely implemented using the POSIX threads API? I'm
asking this because DRD reports many more data races than I expected
for Mac OS X GUI applications. As an example, many data races are
reported for both Terminal and Safari on functions called from inside
CFRunLoop* functions, while the documentation of the NSRunLoop class
specifies that the implementation of NSRunLoop is not thread-safe
(http://developer.apple.com/documentation/Cocoa/Reference/Foundation/Classes/NSRunLoop_Class/Reference/Reference.html).

Bart. |