
#52 SVN 2006-09-02 - Silent crash

CVS
closed-accepted
capture (20)
6
2006-12-28
2006-09-02
Jesse Litton
No

You might already know about this, but just in case:

The current SVN version silently exits after recording
for a moment or two. I see the following assert if I
run it from the command line:

xvidcap: pthread_mutex_lock.c:108:
__pthread_mutex_lock: Assertion `mutex->__data.__owner
== 0' failed.
Aborted

This is on Kubuntu 6.06.1 (64-bit), compiled with gcc
4.0.3. Codec and format are both set to auto, audio
disabled.

Thanks,
-J

Discussion

(Page 1 of 4)
  • Logged In: YES
    user_id=782084

    (update, accepted)
    Hi,
    yep, this is a major annoyance. Haven't gotten to the bottom
    of this, as it is not always reproducible for me.
    I have the feeling I'm only seeing it on my P4 HT enabled
    with SMP kernel.
    What CPU are you on? Is it a Hyperthreading enabled or
    Multicore? Could you try with a non-SMP kernel?

    Since this is an assertion from within libpthread, I don't
    quite know if I can do anything about it, or whether it
    needs to be fixed upstream.

     
    • priority: 5 --> 3
    • assigned_to: nobody --> charly4711
    • status: open --> open-accepted
     
  • Jesse Litton
    2006-09-03

    Logged In: YES
    user_id=210111

    Well, I am using an AMD4400x2 (dual-core) processor... but
    I'm not completely convinced it's a bug in libpthread just
    yet (but I've been wrong before <g>).

    Since I seem to be able to consistently recreate the
    problem, I'll take a crack at throwing some debugging output
    into my local copy and see what falls out. I don't have
    anywhere near your Linux expertise, but another set of eyes
    can't hurt.

    Linux pluto 2.6.15-26-amd64-k8 #1 SMP PREEMPT Thu Aug 3
    03:11:38 UTC 2006 x86_64 GNU/Linux

     
  • Logged In: YES
    user_id=782084

    didn't say I was "convinced" ... just that it might.

    If it's reproducible for you, testing with non-SMP kernel
    might help to rule it out.
    Have been reading about this error message in a context
    where threads on two CPUs would get a lock on a mutex at the
    same time. This should not be possible with a non-SMP kernel.

     
  • Jesse Litton
    2006-09-04

    Logged In: YES
    user_id=210111

    I looked into it a little bit last night, and must admit to
    being very perplexed. The assert doesn't seem to be
    generated at the same time xvidcap is doing any of its
    locking operations, which I didn't expect. Don't pay
    attention to the specific line numbers of my additional
    debugs below (because they no longer correspond to the
    original source), but the sequence below is what I'm seeing
    before the crash. Also, ignore the fact that it says it's
    in xvc_job_validate()... that was just the last function to
    define DEBUGFUNCTION.

    The code seems to return from every lock operation
    successfully... At first I thought I might not be seeing
    the right info because some of the debugging output was
    going to stdout and might have been buffered... but I
    redirected the important ones to stderr, which should have
    corrected any sequencing mismatch.

    ...
    capture.c captureFrameToImage(): Entering
    capture.c captureFrameToImage(): going to fetch image next
    capture.c XGetZPixmap(): Entering
    capture.c XGetZPixmap(): read 1737904 bytes
    capture.c XGetZPixmap(): Leaving
    capture.c captureFrameToImage(): Leaving
    capture.c TCbCaptureSHM(): calling job->save
    capture.c TCbCaptureSHM(): called job->save
    capture.c TCbCaptureSHM(): pic_no=413 flags=4102 state=4
    VC_REC 4 - VC_STOP 0
    capture.c TCbCaptureSHM(): we're recording
    capture.c TCbCaptureSHM(): before remove_state @ 894
    job.c xvc_job_validate(): Locking mutex @ 607
    job.c xvc_job_validate(): after lock @ 610
    job.c xvc_job_validate(): Unlocking mutex @ 614
    job.c xvc_job_validate(): after unlock @ 617
    capture.c TCbCaptureSHM(): after remove_state @ 897
    capture.c TCbCaptureSHM(): reading an image in a data
    sturctur present
    capture.c captureFrameToImage(): Entering
    capture.c captureFrameToImage(): going to fetch image next
    capture.c XGetZPixmap(): Entering
    capture.c XGetZPixmap(): read 1737904 bytes
    capture.c XGetZPixmap(): Leaving
    capture.c captureFrameToImage(): Leaving
    capture.c TCbCaptureSHM(): calling job->save
    capture.c TCbCaptureSHM(): called job->save
    capture.c TCbCaptureSHM(): pic_no=414 flags=4102 state=4
    VC_REC 4 - VC_STOP 0
    capture.c TCbCaptureSHM(): we're recording
    capture.c TCbCaptureSHM(): before remove_state @ 894
    job.c xvc_job_validate(): Locking mutex @ 607
    job.c xvc_job_validate(): after lock @ 610
    job.c xvc_job_validate(): Unlocking mutex @ 614
    job.c xvc_job_validate(): after unlock @ 617
    capture.c TCbCaptureSHM(): after remove_state @ 897
    capture.c TCbCaptureSHM(): reading an image in a data
    sturctur present
    capture.c captureFrameToImage(): Entering
    capture.c captureFrameToImage(): going to fetch image next
    capture.c XGetZPixmap(): Entering
    capture.c XGetZPixmap(): read 1737904 bytes
    capture.c XGetZPixmap(): Leaving
    capture.c captureFrameToImage(): Leaving
    capture.c TCbCaptureSHM(): calling job->save
    capture.c TCbCaptureSHM(): called job->save
    xvidcap: pthread_mutex_lock.c:108: __pthread_mutex_lock:
    Assertion `mutex->__data.__owner == 0' failed.
    = 0x41d5d0
    target = 7
    targetCodec = 7
    ncolors = 256
    color_table = 0xb825f0
    colors = 0xb815e0
    win_attr (w/h/x/y) = 826/526/61/438
    area (w/h/x/y) = 826/526/61/438
    xtoffmpeg.c XImageToFFMPEG(): Entering
    xtoffmpeg.c dump32bit(): Entering with image 0xbaecd0
    xtoffmpeg.c dump32bit(): Leaving
    xtoffmpeg.c XImageToFFMPEG(): calling encode_video with
    codec 0xbb7d60, outbuf 0x2aaab2bed010, outbuf size
    -1296117744, output frame 0xc3a130
    xtoffmpeg.c do_video_out(): Entering with format context
    0xbb69e0 output stream 0xba9d20 buffer 0x2aaab2bed010 size 265
    xtoffmpeg.c do_video_out(): Leaving
    capture.c TCbCaptureSHM(): submitting capture of next frame
    in 71 milliseconds
    gnome_ui.c xvc_frame_monitor(): Entering with time = 29
    gnome_ui.c xvc_frame_monitor(): Leaving with percent = 30
    gnome_ui.c do_record_thread(): going for next frame
    gnome_ui.c do_record_thread(): woke up
    ...

    I'm building a non-SMP kernel in the background and will
    give it a try today or tomorrow, per your earlier request.

    -J

     
  • Logged In: YES
    user_id=782084

    What I don't get is: How come the debug output continues
    after the assertion? This can't be right, can it?

    Also, there are two mutexes (actually three, but the
    update_filename_mutex is no longer needed) and it would be
    good to make sure which one is causing the problem. One
    would be recording_mutex in gnome_ui.c, the other mp in
    xtoffmpeg.c.
    Dunno how much time I'll be able to spend on this today, but
    you might want to try building without audio support to
    avoid use of the second one (used for audio capture), so
    after configure you might want to make clean, edit config.h
    and unset HAVE_FFMPEG_AUDIO, then build.

    Also, I should be initializing recording_mutex to
    PTHREAD_MUTEX_INITIALIZER and don't atm.
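
For reference, a minimal sketch of what that initialization could look like. Only the name recording_mutex comes from the discussion above; frames_recorded, record_one_frame() and init_dynamic_mutex() are made up for illustration and are not taken from the xvidcap sources.

```c
#include <pthread.h>
#include <stddef.h>

/* A statically allocated mutex can be initialized where it is defined. */
static pthread_mutex_t recording_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical shared state standing in for whatever the mutex guards. */
static int frames_recorded = 0;

/* Increment the counter under the lock.
 * Returns the new count, or -1 if locking or unlocking failed. */
int record_one_frame(void)
{
    int count;

    if (pthread_mutex_lock(&recording_mutex) != 0)
        return -1;
    count = ++frames_recorded;
    if (pthread_mutex_unlock(&recording_mutex) != 0)
        return -1;
    return count;
}

/* A mutex that is not statically allocated (e.g. a struct member on the
 * heap) is initialized at run time instead. Returns 0 on success. */
int init_dynamic_mutex(pthread_mutex_t *m)
{
    return pthread_mutex_init(m, NULL);
}
```

Checking the return values of the lock and unlock calls, as done here, also makes it easier to see which mutex misbehaves.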

     
  • Logged In: YES
    user_id=782084

    (update)
    have not gotten a single one of those on my Athlon XP2400+
    while testing many other things during the last two days.
    With those things out of the way, I'll look into this on my
    P4 HT enabled, next.
    Any news from your side?

     
  • Logged In: YES
    user_id=782084

    (update)
    haven't gotten a single crash with HAVE_FFMPEG_AUDIO
    #undef'ed, however --audio no is not sufficient. More
    research needed, but it seems the mutex for audio capture is
    the culprit. If you can live without audio, compiling
    without audio support may be a temporary workaround.

     
  • Jesse Litton
    2006-09-09

    Logged In: YES
    user_id=210111

    >> haven't gotten a single crash with HAVE_FFMPEG_AUDIO
    #undef'ed, however --audio no is not sufficient. <<

    I too tried "--audio no", and found it did not stop the asserts.

    I rebuilt my copy with FFMPEG disabled (tried once with it
    def'd to zero, and once with it undef'd completely - binary
    says that it does not have audio support when I prompt
    params), as you suggested. But, it still seems to have the
    same problem. :(

    >> What I don't get is: How come the debug output continues
    after the assertion? This can't be right, can it? <<

    I noticed earlier that if I let all the output stream to the
    console, the assert is always last (as expected) - but when
    I redirect stdout and stderr to a file, it doesn't come out
    last. It's got to be some kind of file buffering issue.
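
That would fit how C stdio behaves: stdout is normally fully buffered once it is redirected to a file, while stderr is not, so the two streams can end up out of order in the file. A rough sketch of a debug helper that avoids this follows. The names debug_log() and make_stdout_line_buffered() are hypothetical, not functions from xvidcap.

```c
#include <stdio.h>

/* Ask stdio to flush stdout at every newline, even when it is redirected
 * to a file. Best called before anything has been written to stdout.
 * Returns 0 on success. */
int make_stdout_line_buffered(void)
{
    return setvbuf(stdout, NULL, _IOLBF, 0);
}

/* Write one debug line in the "file function(): message" style used in
 * the logs above and flush it immediately, so its position in the output
 * reflects when it was written.
 * Returns the number of characters written, or -1 on error. */
int debug_log(FILE *out, const char *file, const char *func, const char *msg)
{
    int n = fprintf(out, "%s %s(): %s\n", file, func, msg);

    if (n < 0)
        return -1;
    if (fflush(out) != 0)
        return -1;
    return n;
}
```

Sending all debug output through one stream, or flushing after every line as above, should keep the assert message last in a redirected log as well.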

    When running from the console, the last lines I see are:

    capture.c TCbCaptureSHM(): called job->save
    capture.c TCbCaptureSHM(): submitting capture of next frame
    in 72 milliseconds
    gnome_ui.c xvc_frame_monitor(): Entering with time = 28
    gnome_ui.c xvc_frame_monitor(): Leaving with percent = 30
    xvidcap: pthread_mutex_lock.c:108: __pthread_mutex_lock:
    Assertion `mutex->__data.__owner == 0' failed.

    I will test with my non-SMP kernel in just a little bit.
    It's been pretty crazy lately and I still haven't had the
    chance.

    -J

     
  • Logged In: YES
    user_id=782084

    (update)
    Lots of things point towards glibc/libpthread. There seem
    to be issues around NPTL. This can potentially be worked
    around by setting LD_ASSUME_KERNEL.

    I'll elaborate a little so I'll be able to find that
    information again myself:
    Linux systems typically have libraries for the different
    threading implementations lying around and the dynamic
    linker determines which ones to use based on compatibility
    information the libraries provide and the running system.
    The information the running system publishes can be tweaked.

    This is explained in more detail here:
    http://people.redhat.com/drepper/assumekernel.html
    (I'll attach it for persistence)

    So if you know what versions require what kernel, you can
    force the dynamic linker to pick a certain one. On Ubuntu you
    cannot use eu-readelf, but would rather use objdump like this:

    $ LC_ALL=C objdump -s -j .note.ABI-tag /lib/libpthread.so.0

    /lib/libpthread.so.0: file format elf32-i386

    Contents of section .note.ABI-tag:
    0134 04000000 10000000 01000000 474e5500 ............GNU.
    0144 00000000 02000000 02000000 00000000 ................
    ................^........^........^

    This tells you that this library (the non-nptl version)
    requires kernel 2.2.0+

    $ LC_ALL=C objdump -s -j .note.ABI-tag /lib/tls/libpthread.so.0

    /lib/tls/libpthread.so.0: file format elf32-i386

    Contents of section .note.ABI-tag:
    0134 04000000 10000000 01000000 474e5500 ............GNU.
    0144 00000000 02000000 06000000 00000000 ................
    ................^........^........^

    This tells you that the new implementation requires kernel
    2.6.0+.
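
Reading the version off the hex dump by hand is error-prone, so here is a small sketch that decodes the same bytes. It assumes the layout visible in the dumps above: three little-endian 32-bit header words (name size 4, descriptor size 16, type 1), the name "GNU", then four words for OS, major, minor and patch level. The names struct abi_tag and parse_abi_tag() are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded contents of a .note.ABI-tag descriptor. */
struct abi_tag {
    uint32_t os;     /* 0 means Linux */
    uint32_t major;  /* minimum kernel version: major.minor.patch */
    uint32_t minor;
    uint32_t patch;
};

/* Read one little-endian 32-bit word, as in the elf32-i386 dumps above. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the raw 32 bytes of the section.
 * Returns 0 on success, -1 if the bytes do not look like a GNU ABI tag. */
int parse_abi_tag(const unsigned char *note, size_t len, struct abi_tag *out)
{
    if (note == NULL || out == NULL || len < 32)
        return -1;
    if (read_le32(note) != 4 ||        /* name size: "GNU\0" */
        read_le32(note + 4) != 16 ||   /* descriptor size: four words */
        read_le32(note + 8) != 1)      /* note type */
        return -1;
    if (memcmp(note + 12, "GNU\0", 4) != 0)
        return -1;

    out->os    = read_le32(note + 16);
    out->major = read_le32(note + 20);
    out->minor = read_le32(note + 24);
    out->patch = read_le32(note + 28);
    return 0;
}
```

Fed the 32 bytes from the /lib/tls dump above, this yields major 2, minor 6, patch 0, which matches the 2.6.0 read off by hand.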

    So doing the following sounds like a promising
    workaround (though I need to test more because I cannot
    always reproduce this):

    $ LD_ASSUME_KERNEL=2.4.19 ~/xvidcap/bin/xvidcap --mf

    If we can confirm that this helps, I'll raise the issue
    upstream.

     