#20 pthread issues

Status: open
Owner: John Skaller
Priority: 8
Updated: 2006-08-13
Created: 2006-08-13
Creator: Markus Elfring
Private: No

I want to continue the discussion from the bug report
"Check return codes everywhere"
(https://sourceforge.net/tracker/?func=detail&atid=394143&aid=1538607&group_id=28597)
about specific details in this request.

1. Would you like your pthread wrapper to conform to
the standard "IEEE Std 1003.1, The Open Group Base
Specifications Issue 6"?
http://www.opengroup.org/onlinepubs/009695399/basedefs/pthread.h.html
Do I get wrong expectations from the instruction
"#include <pthread.h>"?

2. There seem to be some limitations.
Example:
http://www.opengroup.org/onlinepubs/009695399/functions/pthread_detach.html
"[...] Felix uses distinct classes: my design is
better, given a constraint that you can't detach a
thread after it is created. [...]"

3. I would prefer that the abstraction classes should
be separated from the operating system wrappers into
another file package.
http://felix.cvs.sourceforge.net/felix/lpsrc/flx_pthread.pak?revision=1.40&view=markup

4. How do you deal with the binding specification
'extern "C"' that is required for the "start routine"
(in C++)?
http://www.opengroup.org/onlinepubs/009695399/functions/pthread_create.html

5. Does your wait interface for condition variables
provide protection against spurious wakeups?
http://groups.google.de/groups?threadm=40ed1d8f.0411191313.4dff837c@posting.google.com

6. Would you like to add support for spin locks and
thread-local storage?

7. All accesses to the attribute
"worker_fifo::nthreads" need memory synchronization
(mutex).

8. It is a pity that most class libraries do not fit to
your requirements. (licence, ...)

9. Do you contribute your experience as a member of the
C++ Standardisation committee to the topic "Simplifying
And Extending Mutex and Scoped Lock Types For C++
Multi-Threading Library"
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2043.html)
by Ion Gaztañaga?

Discussion

  • John Skaller
    2006-08-13

    "1. Would you like that your wrapper for will conform to
    the standard [] Do I get wrong expectations from the
    instruction "#include <pthread.h>"?"

    Felix doesn't provide <pthread.h>: that include is including
    the system pthread.h file on what we believe is a Posix
    compliant platform. If pthread.h doesn't conform to Posix it
    isn't our fault (though we may have to work around such a
    limitation).

    We actually provide, in directory rtl, these headers:

    pthread_condv.hpp pthread_sleep_queue.hpp
    pthread_counter.hpp pthread_thread.hpp
    pthread_monitor.hpp
    pthread_win_posix_condv_emul.hpp
    pthread_mutex.hpp pthread_work_fifo.hpp
    pthread_semaphore.hpp

    however at present, these are not really intended to be used
    by end users -- they're there to support the rest of the
    Felix system.

     
  • John Skaller
    2006-08-13

    • priority: 5 --> 8
    • assigned_to: nobody --> skaller
     
  • John Skaller
    2006-08-13

    "1. Would you like that your wrapper for will conform to
    the standard "IEEE Std 1003.1, The Open Group Base
    Specifications Issue 6"?"

    Now, if you're referring to the *Windows* code which wraps
    Windows API calls, and provides 'Posix semaphores' .. it
    would indeed be useful if this emulation were Posix
    compliant. This is a special case where the Felix semaphore
    support is based on Posix, and the implementation assumes it
    has a Posix compliant API to work with. The Windows
    emulation should be compliant *except* for error codes (if
    it isn't, it's a bug that needs fixing).

     
  • John Skaller
    2006-08-13

    "2. There seem to be some limitations.
    Example:
    http://www.opengroup.org/onlinepubs/009695399/functions/pthread_detach.html
    "[...] Felix uses distinct classes: my design is
    better, given a constraint that you can't detach a
    thread after it is created. [...]"

    Abstraction always involves limitations. The wrapper classes
    don't provide 1-1 mapping to Posix functionality. That's
    deliberate. For a start, whatever is provided must be
    supportable on Windows as well.

    In addition, not all the Posix functionality is useful, and
    some of it is a compromise. In the case of detaching a
    joinable thread there is a choice: the C++ wrapper I provide
    makes the choice between detached or joinable part of the
    type. This makes it impossible to call 'join()' on a
    detached thread, since the type system enforces that.

    Since the type based distinction is necessarily static,
    there's no way to detach a joinable thread without risking a
    call to join() which would no longer work.

    If we really wanted joinable threads which could be
    dynamically detached it would be easy enough to refactor the
    class design so there were THREE classes.. but such need has
    not arisen yet.
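
The static distinction described above can be sketched in C++ roughly as follows. This is only an illustration, not the Felix RTL code: the names joinable_thread, detached_thread, c_start_fn and double_it are hypothetical, and error handling is reduced to returning the pthread error code.

```cpp
#include <pthread.h>

// Hypothetical names throughout; the real Felix classes are not
// reproduced here.
extern "C" { typedef void *(*c_start_fn)(void *); }

// A joinable thread: join() exists on this type only.
class joinable_thread {
  pthread_t tid;
  bool started;
  joinable_thread(const joinable_thread &);            // non-copyable
  joinable_thread &operator=(const joinable_thread &);
public:
  joinable_thread() : tid(), started(false) {}
  // Returns 0 on success, otherwise the pthread_create error code.
  int start(c_start_fn fn, void *arg) {
    int rc = pthread_create(&tid, 0, fn, arg);
    started = (rc == 0);
    return rc;
  }
  // Returns 0 on success; the thread's result is stored in *result.
  int join(void **result) {
    if (!started) return -1;
    started = false;
    return pthread_join(tid, result);
  }
};

// A detached thread: there is deliberately no join() member, so
// trying to join one is a compile-time error.
class detached_thread {
public:
  int start(c_start_fn fn, void *arg) {
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_t tid;
    rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
  }
};

// Example start routine with C linkage: doubles the int it is given.
extern "C" void *double_it(void *p) {
  int *n = static_cast<int *>(p);
  *n *= 2;
  return p;
}
```

With this layout, writing `detached_thread d; d.join(0);` is rejected by the compiler, which is the guarantee the type-based design is after.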

     
  • John Skaller
    2006-08-13

    "3. I would prefer that the abstraction classes should
    be separated from the operating system wrappers into
    another file package."

    That could probably be done, and probably makes sense.

     
  • John Skaller
    2006-08-13

    "4. How do you deal with the binding specification
    'extern "C"' that is required for the "start routine"
    (in C++)?"

    Woops! I had:

    static void *start_wrapper(void *e)

    This is wrong. Hmm .. changed to:

    extern "C" static void *start_wrapper(void *e)

    which works with Linux/g++ 4.0x but I'm not sure you can
    actually say both extern "C" and also static .. :)

    In this case I don't want an external name at all .. static
    is desirable. But C++ doesn't provide a way to change the
    calling conventions OTHER than extern "C" (which isn't
    guaranteed to do it either :)

     
  • John Skaller
    2006-08-13

    "5. Does your wait interface for condition variables
    provide protection against spurious wakeups?"

    The C++ level interface doesn't, because C++ is too dumb to
    provide closures. Function objects (functoids) would make
    it possible, but they're much too hard to use.

    The Felix level wrapper will provide this protection (but
    there is no such wrapper at the moment). Felix has no
    problem generating the required applicative classes, indeed
    it is designed to overcome the weaknesses of C and C++ in
    this area.

     
  • John Skaller
    2006-08-13

    "6. Would you like to add support for spin locks and
    thread-local storage?"

    Not sure. Spin locks may be useful, I'll have to look at
    what they do and where we might use them. Got a reference on
    that one? :)

    TLS, in general, is a hack to fix legacy C code that wasn't
    written with threads in mind -- including the really dumb
    'errno' rubbish in the C library, which, perhaps
    unfortunately, Posix continues to support.

    Since all OUR code is re-entrant, it has no direct utility:
    we don't use global storage (except in one hack :)

    However it may be useful for supporting clients who have
    legacy code to support.

     
  • John Skaller
    2006-08-13

    "7. All accesses to the attribute
    "worker_fifo::nthreads" need memory synchronization
    (mutex)."

    Ah, thanks! Fixed.

     
  • John Skaller
    2006-08-13

    "8. It is a pity that most class libraries do not fit to
    your requirements. (licence, ...)"

    The biggest problem isn't the licence, although that can be
    a hassle. It's the way the libraries are built and
    distributed. If everyone had Debian, it would be easier. If
    the developers lived for ever and fixed bugs quickly, and
    always agreed with sensible requirements .. and wrote
    platform independent build scripts ..

    sure, yes, that would be great!

    Elkhound: Scott McPeak's GLR parser, which is integrated
    into Felix. Written in archaic C++. Scott is good at fixing
    any bugs, but he's NOT willing to bring his code up to
    industrial standards so I had to fork it.

    Tre: C which doesn't compile under C++. Uses the autotools
    which is a pain, uses libtool which is a disaster: broken on
    Cygwin. No support for building Windows native. My fork is
    edited to compile under C++ and builds on all platforms.
    It's also 2 patch levels out of date -- this is a pain, but
    what can I do?

    Well actually, Erick has provided C compiler support now,
    but we haven't got around to deploying it. This might allow
    us to ship a copy of TRE, unmodified (but we still have to
    write our own config script).

    Cil: forked. I can't expect the Cil developers to change
    their code to do what I need. Cil itself forks FrontC, the C
    parser :)

    My experience over 3 decades is: never trust any third party
    code. Avoid it at almost all costs. It's better to have
    control, even if it means more work. Occasionally, there may
    be an exception to this rule, such as a widely used public
    domain package which is supported by many people in some way.

    If ONLY people actually adhered to Standards, AND the
    Standards were good, this problem would be reduced.. but C
    is an exceptionally bad programming language and it is
    unlikely any significant C code is ever trustworthy.

    C++ is better .. but not enough, and it has too many
    weaknesses and the OO is used far too heavily.

    Compared to Ocaml code, which I would be much happier to use
    'as is': the language is stronger, the users are smarter,
    and the build requirements are minimal (it just works on all
    platforms). Ocaml code is fairly reliable .. but I still
    don't use a single third party library -- I do use Cil, but
    not in the Felix compiler, only in the wrapper generator,
    which is less critical -- and it is my own fork of it.

    So .. if EVERYONE used, for example, Debian Linux distro, I
    might indeed use third party libraries .. because I can rely
    to some extent on the Debian package management system and
    autobuilder to distribute the code to clients without much
    fuss -- that would reduce problems to bugs in the code
    itself, rather than problems building it.

     
  • John Skaller
    2006-08-13

    "9. Do you contribute your experience as a member of the
    C++ Standardisation committee to the topic "Simplifying
    And Extending Mutex and Scoped Lock Types For C++
    Multi-Threading Library"
    (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2043.html)
    by Ion Gaztañaga?"

    No, I resigned from the Committee a while ago and don't
    participate in C++ development anymore .. basically I gave
    up, and decided it would be more productive to fix the
    problems by designing a new language :)

    The paper you cite is interesting, and I would certainly
    look at using an emulation of an interface proposed and
    likely to be accepted into the Standard.

     
  • Markus Elfring
    2006-08-13

    1. I am concerned about your Pthreads-Win32 implementation.
    It seems that it is incomplete and does not conform to the
    standard yet. Pthreads-Unix implementations are more compliant.
    I guess that this is a big source of errors and doubts about
    reliability.

    4. This update might be another error.
    The function "pthread_create" expects to call other C
    functions and not static C++ member functions.
    Would you like to look at
    "https://ssl.simtec.mb.uni-siegen.de/tracThreadsPP/trac.cgi/file/lib/Thread.cc"
    for an example? (Ask "michael.weitzel@uni-siegen.de" for access)

    5. Please provide a "wait_until" interface that checks for
    the expected predicate in a loop.
    http://www.opengroup.org/onlinepubs/009695399/functions/pthread_cond_wait.html

    6. links:
    http://en.wikipedia.org/wiki/Spinlock
    http://en.wikipedia.org/wiki/Thread-local_storage

    10. How much are you interested in the support of recursive,
    read-write and process-shared mutexes?

     
  • John Skaller
    2006-08-13

    "1. I am concerned about your Pthreads-Win32 implementation.
    It seems that it is incomplete and does not conform to the
    standard yet. Pthreads-Unix implementations are more compliant.
    I guess that this is a big source of errors and doubts about
    reliability."

    Can you be more specific? There is no Pthreads-Win32
    implementation. We have two C++ classes:

    flx_thread_t // joinable
    flx_detached_thread_t // detached

    for threads. The Win32 versions of these are implemented
    directly using the Windows API. Since Pthreads are defined in
    terms of the C language, the behaviour of these classes
    isn't covered by Posix so they're non-compliant simply
    because they're C++.

    For condition variables, there is an emulation of Posix.

    This code was copied from an expert's code, and I believe it
    is the emulation used in ACE. I didn't write this code. See:

    http://www.cs.wustl.edu/~schmidt/win32-cv-1.html

    for details. The principal bug in this code appears to be
    that it doesn't return Posix compliant error codes. However
    this doesn't matter as such, since the code is there simply
    to allow

    class flx_condv_t

    to be implemented.

     
  • John Skaller
    2006-08-13

    "4. This update might be another error.
    The function "pthread_create" expects to call other C
    functions and not static C++ member functions."

    In the new code this is what happens. The Posix version
    of the wrapper is declared:

    extern "C" static void *start_wrapper(void *e)

    The Windows version is now declared:

    DWORD WINAPI start_wrapper(LPVOID e)

    Both are C functions, not static members.

    The name 'start_wrapper' could be unfortunate: in the
    Windows case the name probably has external linkage, which
    we do NOT want because it might cause conflicts. In the
    Posix case it's static, but I'm not sure if this will work
    with 'extern "C"' for all C++ compilers. It seems to work
    with g++ though. I don't know if the ISO C++ Standard
    permits this. A safe fallback is probably some weird name.

     
  • John Skaller
    2006-08-13

    "5. Please provide a "wait_until" interface that checks for
    the expected predicate in a loop."

    It is possible to wait now, using the same technique as in
    C: the client has to write the loop and check the predicate.

    This sucks. We'd really like 'wait_until(P)', and in Felix
    that's what you'll get. If you look at the module Pthread,
    you'll see there is NO support yet for condition variables
    in Felix: I haven't written it.

    The reason for not providing this in the C++ classes is as I
    explained before: C++ is a rather brain dead language.
    It would be trivial to provide 'wait_until(P)' where P is
    an applicative object.

    It is hard to define such a class: since you can't use a
    function local class, nor a nested class, and expect it to
    bind to the current context, you have to write a class in a
    non-local context, and pass it all the required arguments.

    Doing all this is so much work, it isn't worth the effort.
    It's much easier to just write the condition test inline,
    because that *does* bind to the current context.

    This is not to say we cannot provide such an interface .. I
    just can't see myself using it.

    The key problem is deciding what type to use for a predicate
    .. since the wait_until() function isn't a template, there
    has to be a universal abstract base from which to derive
    your predicate: but there's no such type in the C++ Standard
    AFAIK.

    Felix solves this problem by allowing nested everything, and
    the function type is simply

    unit -> bool
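
A 'wait_until(P)' of the kind discussed above could be sketched in C++ like this. It is an illustration under stated assumptions, not the Felix interface: the names predicate, condv, counter_reached, shared_state, bump_three_times and run_example are invented for the example. The predicate is an abstract base class because the function is not a template, and the loop re-tests the predicate after every wakeup, which is what guards against spurious wakeups.

```cpp
#include <pthread.h>

// Hypothetical abstract base for predicates (a stand-in for the
// "universal abstract base" mentioned above).
struct predicate {
  virtual bool operator()() const = 0;
  virtual ~predicate() {}
};

// Minimal condition variable wrapper bundling its own mutex.
class condv {
  pthread_mutex_t m;
  pthread_cond_t c;
  condv(const condv &);
  condv &operator=(const condv &);
public:
  condv() { pthread_mutex_init(&m, 0); pthread_cond_init(&c, 0); }
  ~condv() { pthread_cond_destroy(&c); pthread_mutex_destroy(&m); }
  void lock() { pthread_mutex_lock(&m); }
  void unlock() { pthread_mutex_unlock(&m); }
  void broadcast() { pthread_cond_broadcast(&c); }
  // Caller must hold the lock. Returns with the lock held and p() true.
  void wait_until(const predicate &p) {
    while (!p()) pthread_cond_wait(&c, &m);
  }
};

// The predicate is written out of line and handed its context
// explicitly, which is the inconvenience described above.
struct counter_reached : predicate {
  const int &counter;
  int target;
  counter_reached(const int &c, int t) : counter(c), target(t) {}
  bool operator()() const { return counter >= target; }
};

struct shared_state {
  condv cv;
  int counter;
  shared_state() : counter(0) {}
};

// Worker: bumps the counter three times, signalling each change.
extern "C" void *bump_three_times(void *p) {
  shared_state *s = static_cast<shared_state *>(p);
  for (int i = 0; i < 3; ++i) {
    s->cv.lock();
    ++s->counter;
    s->cv.broadcast();
    s->cv.unlock();
  }
  return 0;
}

// Starts the worker, waits until the counter reaches 3, returns it.
int run_example() {
  shared_state s;
  pthread_t t;
  if (pthread_create(&t, 0, bump_three_times, &s) != 0) return -1;
  s.cv.lock();
  s.cv.wait_until(counter_reached(s.counter, 3));
  int seen = s.counter;
  s.cv.unlock();
  pthread_join(t, 0);
  return seen;
}
```

The client-written loop mentioned earlier in this comment is the single `while (!p()) pthread_cond_wait(...)` line; the wrapper only moves it out of the caller's code.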

     
  • John Skaller
    2006-08-13

    Spinlocks: These, and other 'advanced' synchronisation stuff
    are of interest .. but I don't know what we'd use them for
    in Felix at the moment. My company Async P/L is quite
    interested in OS/processor specific highly optimised
    techniques for making multi-processors useful. At the moment
    typical applications actually run slower on my dual core
    than my single core (and the x2 machine has a higher clock
    speed and double the RAM). A related idea to spinlocks is to
    use uncached memory for locking .. when you need to
    synchronise SOME memory, but not all of it.

    In particular, a CMT (Composable Memory Transactions)
    implementation may benefit. This is a memory equivalent of
    the database 'commit rollback' idea, which basically avoids
    locking at the expense of having to retry the whole
    transaction in the unlikely event there is contention.

     
  • John Skaller
    2006-08-13

    "10. How much are you interested in the support of recursive,
    read-write and process-shared mutexes?"

    We're interested in anything that improves performance and
    safety. At the moment I do not see any use for recursive
    mutex -- it looks to me like anytime you think you need one
    you actually have a design error. I could be wrong. We did
    have one at one stage but I got rid of it :)

    RW mutex could be interesting, don't know.

    Process level interaction is a whole new topic!

     
  • John Skaller
    2006-08-14

    Serialisation of access to nthreads changed: mutex is used
    in the get_nthreads and set_nthreads methods instead.

    * Serialisation is not required in the constructor, because
    the object isn't available for external use until
    construction is completed.

    * Serialisation isn't required in the start and stop methods
    because they're private and can't be accessed externally.

    * Serialisation of the destructor would imply a contention
    between a client thread changing the size of the worker
    pool, and another trying to destroy it altogether, which is a
    design problem which can't be solved by serialisation.

    [It can be solved by garbage collection]

    * set_nthreads is the only public routine left that can
    change the number of threads. A contention here may or may
    not be a design error, however the implementation expects
    the nthreads value to change monotonically and may fail
    otherwise. So serialisation of access to nthreads variable
    isn't enough.

    Instead, the whole of set_nthreads is protected by a mutex,
    so only one adjustment to the thread count at a time is
    possible. get_nthreads is also protected, so it cannot be
    read whilst the thread count is being adjusted.
    Unfortunately this prevents monitoring the progress of
    set_nthreads. TWO mutexes would solve this problem.

    The right way to do this is probably to provide an unsafe
    class, and wrap it with a thread safe version, as is done
    with the garbage collector and a couple of other classes.
    However this class isn't performance critical (the collector
    is).
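
The arrangement described above, where one mutex covers both the whole of set_nthreads and the read in get_nthreads, might look like the sketch below. The class pool_counter and the lock_guard helper are hypothetical stand-ins: the real worker_fifo starts and stops threads, which is modelled here by simply stepping a counter.

```cpp
#include <pthread.h>

// Hypothetical RAII helper: locks in the constructor, unlocks in the
// destructor, so every return path releases the mutex.
class lock_guard {
  pthread_mutex_t &m;
  lock_guard(const lock_guard &);
  lock_guard &operator=(const lock_guard &);
public:
  explicit lock_guard(pthread_mutex_t &mx) : m(mx) { pthread_mutex_lock(&m); }
  ~lock_guard() { pthread_mutex_unlock(&m); }
};

// Stand-in for the thread pool: "starting" or "stopping" a worker is
// modelled as adjusting a counter one step at a time.
class pool_counter {
  pthread_mutex_t m;
  int nthreads;
  void start_one() { ++nthreads; }   // private: only called with m held
  void stop_one() { --nthreads; }
  pool_counter(const pool_counter &);
  pool_counter &operator=(const pool_counter &);
public:
  // No locking in the constructor: the object is not yet shared.
  pool_counter() : nthreads(0) { pthread_mutex_init(&m, 0); }
  ~pool_counter() { pthread_mutex_destroy(&m); }

  // Readers block while an adjustment is in progress.
  int get_nthreads() {
    lock_guard g(m);
    return nthreads;
  }

  // The whole adjustment runs under the mutex, so only one resize can
  // happen at a time and the count moves monotonically towards n.
  void set_nthreads(int n) {
    lock_guard g(m);
    while (nthreads < n) start_one();
    while (nthreads > n) stop_one();
  }
};
```

Because get_nthreads takes the same mutex, a reader only ever sees the count before or after a resize, never an intermediate value; that is the monitoring limitation noted above.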

     
  • John Skaller
    2006-08-14

    "4. Please choose a unique exportable function name. The key
    word "static" should be removed from "start_wrapper"."

    Thanks for the links. Seems like

    extern "C" {
    static ...
    }

    should be OK, but I'll check on comp.std.c++.
    This is better than guessing a unique name,
    but that's a reasonable fallback: we have
    a few extern "C" names anyhow, and even C++
    names can clash .. we reserved 'flx_' prefix
    but it is still best to avoid clashes if possible
    (but not at the expense of the program being ill formed).
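
For reference, the brace form can be sketched as below. Under the ISO C++ linkage rules, a declaration written directly after extern "C" may not carry a storage class specifier, so the single-declaration form 'extern "C" static ...' is ill-formed even where a compiler accepts it; inside the braces 'static' is permitted and gives a function whose type has C language linkage and whose name has internal linkage. The names thread_body, run_in_thread and set_flag are hypothetical and only serve the example.

```cpp
#include <pthread.h>

// Hypothetical thread body interface used only for this sketch.
struct thread_body {
  virtual void run() = 0;
  virtual ~thread_body() {}
};

extern "C" {
  // Inside the braces 'static' is allowed: the function type has C
  // language linkage, and the name is local to this translation unit.
  static void *start_wrapper(void *e) {
    static_cast<thread_body *>(e)->run();
    return 0;
  }
}

// Runs body.run() on a new thread and waits for it to finish.
// Returns 0 on success, otherwise the pthread error code.
int run_in_thread(thread_body &body) {
  pthread_t t;
  int rc = pthread_create(&t, 0, start_wrapper, &body);
  if (rc != 0) return rc;
  return pthread_join(t, 0);
}

// Example body: records that it ran.
struct set_flag : thread_body {
  bool ran;
  set_flag() : ran(false) {}
  void run() { ran = true; }
};
```

Since the wrapper's name never leaves the translation unit, no unique exported name has to be invented for it.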

    "Are there more callback functions for a correction?
    ("thread_start"?)"

    There are no others for pthread_create. However,
    there might be in other places in the RTL.

    There is actually a *language feature* for
    generating callback wrappers ... however I think
    it uses external linkage.. not sure .. not even
    sure I put the extern "C" around these callbacks :)

     
  • John Skaller
    2006-08-14

    "7. The monitoring of the thread count is not prevented after
    your addition of proper synchronisation."

    In the latest revision, it is prevented *during* adjustment of
    the thread pool size: it takes a mutex, and the whole
    adjustment of the thread pool takes the same mutex. So you
    can't monitor the thread count during the adjustment.

    "Can it be that the comment "The class is actually a thread
    plus a job queue." in the code should be adjusted?
    http://en.wikipedia.org/wiki/Thread_pool_pattern"

    You are correct, the comment is wrong. This worker_fifo USED
    to be a single thread. I turned it into a pool of threads ..
    didn't fix the comment .. so much for literate programming :)

     
  • Markus Elfring
    2006-08-14

    4. Do you support namespaces for classes in your library?

    7. Well, this kind of "lockout" (for a short moment) is an
    intended way to achieve a consistent state. Would you like
    to consider alternative techniques to lower your worries
    about contention?
    http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms

     
  • Markus Elfring
    2006-08-19

    11. Would you like to integrate anything from the SR
    programming language?
    http://www.cs.arizona.edu/sr/doc.html

     
  • John Skaller
    2006-08-20

    "4. Do you support namespaces for classes in your library?"

    Yes they are all in (hopefully) appropriate namespaces.
    For example the collector abstraction is in

    flx::gc::generic

    and the collector provided is in

    flx::gc::collector

    The generated code also goes in a namespace,

    flx::user::<modulename>

    from memory. Only extern "C" functions are put at
    top level.

     
  • John Skaller
    2006-08-20

    "lock free"

    Not sure what this means. The wikipedia article is too
    waffly to really get any idea what you're asking about.

    Felix provides pre-emptive threading with channels for
    communication. Channels are a synchronisation primitive.
    You can use them instead of locks, and you can use them to
    implement locks. Our implementation of channels uses the C++
    monitor_t class, which itself uses mutex for
    synchronisation, on both Windows and Posix platforms.

    Certainly, there may be some other less widely available
    primitives. Windows Vista for example provides a rich set of
    primitives (way beyond Posix). Linux may do so also, and gcc
    may provide some support perhaps via asm. And then there is
    always the possibility of processor, board, and/or kernel
    dependent hacks/extensions.

    Some of the data structures could probably be made 'better'
    with alternate primitives.

    Posix is very bad here. Mutex is a control synchronisation
    primitive. Unfortunately, Posix provides no data
    synchronisation primitives .. so it uses mutex for that too,
    and the synchronisation is global (the whole address space).
    This is a serious design flaw in Posix.

    Felix itself currently provides only crude ways to select
    between alternatives:

    (a) the build system makes the choice (eg: epoll or select?)
    (b) conditional compilation
    (c) selective float insertions

    The latter is a kind of conditional compilation I
    implemented recently .. however it turns out not to do what
    I want.

    Conditional compilation sucks bigtime because it is
    spaghetti. This is when you have a subroutine that does one
    job, and you need two jobs done. Because it is a big
    subroutine, you add a flag, and in many places in the
    subroutine you execute code conditionally on the flag. This
    is also called 'threaded code'. What you really wanted was
    two subroutines, and needed to refactor them to avoid too
    much duplication .. but threading is a quick and dirty hack.

    Conditional compilation is like that .. you have a
    monolithic file and conditionally insert or reject bits of
    text. This is very bad.

    There are alternatives. Felix uses dependency analysis to
    yank in C headers required by used operations automatically.
    This is an example of a superior technique. However it isn't
    as strong as conditional compilation: there's no way to yank
    in Linux code on Linux, and Windows code on Windows, except
    by calling Windows only functions on Windows, and Linux only
    functions on Linux -- when what you wanted was to call a
    single routine, and implemented it in one of two ways
    depending on the platform.

    I need to find a proper way to do this, before heavy use of
    architecture dependent primitives is easy: so actually the
    deeper issue with 'lock free' is a pure high level language
    design issue unrelated to threading issues (except that
    spaghetti code is also sometimes called threaded code :)

     