#25 Semaphore problems

open
nobody
None
2
2004-09-28
2003-06-13
No

If mpg321 is killed in the wrong way, it can leave its semaphores
sitting around locked, and is unable to play again (because it
requests an exclusive semaphore create the next time it runs). I've
triggered the bug twice, mostly during transient network hangs, or by
sending several signals together that kill it during its cleanup
operations.

It's also possible to kill mpg321 before it finishes opening a remote
file to reproduce this problem. This occurs in part because the
signal handlers aren't set up to register an actual exit before the file
starts playing.

One easy way to reproduce is to ensure that startup fails the shmget
initialization; the exit code in that situation does not RMID the
semaphore, or on fork failure, etc., etc. I think this is endemic to the
error handling style used in the main mpg321.c file...
this also reveals a few other bugs with the semaphores; the
semarray declaration, for instance, shouldn't be defined in the .h file,
only declared. (Ditto for decode pos and play pos) There are
numerous other places where this bug can be triggered.

I think an atexit handler might be the easist way to centralize the
semaphore cleanup, or a strategically placed goto. I've attached a
patch that puts it all in an atexit handler. Kill -9'ing the mpg321
process will still result in a lost semaphore, but at least the atexit
handling is an improvement. Patch for this attached. It's still
possible to cause mpg321 to coredump in ao_play() by killing at at
the wrong time, but at least it cleans up its semaphores correctly.

A related bug is that because ftok() is used to generate the ID of the
(private) semaphore, only one copy of mpg321 can be used on the
system at a time. Makes it impossible to use with multiple sound
cards concurrently.

Best solution I can think of is to either create a unique value to pass
to ftok, since the semaphore is only used for communication between
the producer and consumer processes.

Discussion

  • Logged In: YES
    user_id=2636

    I just can't attach files to save my life, can I? Patch attached.

     
  • Improve semaphore cleanup on abnormal exit

     
    Attachments
  • Logged In: YES
    user_id=110462

    The semaphore code is still to be considered beta. However,
    thanks for the idea of atexit (), I didn't think about it. I've also
    done some changes in the initialization section, i.e. use
    IPC_PRIVATE and more. Not already in the CVS, you'll have to
    wait. Or ask if you want to test the code.

     
  • Logged In: YES
    user_id=712646

    I am continueing development in order to get a new release
    out that picks up a number of the patches and closes a bunch
    of bugs. However, I am starting from the CVS version that
    does not use the multi-process IO, since it never worked in
    CVS.

    SInce I plan on adding bufferring later, I'm leaving this
    bug for now. I might pick up the patch later if I decide to
    continue with exiting code for buffering.

     
    • priority: 5 --> 2