From: Christer P. <pa...@no...> - 2001-12-23 02:54:36
|
Hello! Has anyone else experienced problems with X hanging after a while when seeking? I get this really weird problem when I drag the seek "knob" back and forth for a while (sometimes it actually happens right after xine is started, before I even had a chance to start the playback). The symptom is that the control window, and most of the time also the video and audio output, locks up tight and stops responding to any events. The only way out is a kill -9. What's even more annoying is that if I keep xine "running" in that state for a while, X locks up to the point where mouse and keyboard doesn't respond (not even ctrl-alt-backspace), so I'm dunno in how to debug this properly... Killing xine shortly after the problem appears seems to save me from further trouble. If I dump a stack trace before it all falls down on me, I can see the "main" thread sitting inside of select() in XNextEvent(), and the other threads usually sleeping. Although the core problem is obviously with X or the kernel, I only see it with xine, and there it is extremely easy to reproduce. I'm using an audio-only mp3 file to experiment with this and the output drivers are XShm/OSS, so there isn't anything fancy going on. For what it's worth, it seems like xine <= 0.9.7 never causes this on startup (only when seeking), while the current CVS version hangs on startup fairly often. Can anyone think of anything that happens when xine is (re)initialized on startup or as a consequence of a seek, that might cause this? I'm on Red Hat 7.2 with kernel 2.4.9-13 and XFree86 4.1.0-3 on an i386 (IBM ThinkPad 600E). -- Christer Palm |
From: Harm v. d. H. <ha...@et...> - 2001-12-23 03:07:48
|
On Sun, Dec 23, 2001 at 03:59:42AM +0100, Christer Palm wrote: > Hello! > Has anyone else experienced problems with X hanging after a while when > seeking? Yes, I can reliably hang the PC (or, at least, freeze display and input which amounts to the same thing since I don't have a terminal for network login) by dragging the slider in xine back and forth during a few seconds while playing a divx. I figured it was a divx4linux 'feature'. Guess I was wrong. I'm pretty sure this was introduced when seek actions started to reset the decoder plugin(s). Perhaps there's a problem there when those decoder close/init commands are called too quicly? Regards, Harm |
From: Christer P. <pa...@no...> - 2001-12-24 04:19:26
|
Christer Palm wrote: > Hello! > Has anyone else experienced problems with X hanging after a while when > seeking? > I have now debugged this for (quite) a while, and it appears that there are several conditions under which xine can hang when pounding it with seek requests. The situation that occurs most often in my test case (just playing a simple mp3 file), however, seems to be a race condition related to the audio output thread. 1) When fiddling with the seek "knob", xine_play() is called repedeately. 2) xine_play() shuts down the whole thing by calling the demuxers stop() method. The demuxer stop() method immediately shuts down the demuxer thread, and then posts an asyncronous request to the decoder to stop its output driver. This is done by allocating a buffer off the decoder buffer pool, fill it in with a BUF_CONTROL_END request and then post it on the decoders fifo. If there are no buffers available in the decoders buffer pool, the demuxer blocks until one is available. Otherwise, the stop() method immediately returns control to xine_play() without waiting for the output drivers to actually shut down. 3) xine_play() now restarts the demuxer by calling its start() method. Eventually, this will cause the decoder plugin to re-open the output driver by calling its open() method. 4) The output driver open() method checks to see if the output driver thread is already running. If it is, it just prints a warning message and returns. Now, sometimes, step 4 is reached before the output driver has actually shut down. So the output driver open() method will just return thinking that the output driver thread is already running. This is evident from "audio_out: pthread already running!" messages on stdout. This causes audio to disappear. However, it also causes a domino effect on the whole thing; Since there is now no driver thread, there is no one to pick up the decoded frames from the output fifo. This will eventually cause the output fifo to fill up, which appears to cause the decoder to block. If the decoder blocks, there is no one to pick up stuff on the decoder fifo, so eventually that will also fill up. This will probably cause the demuxer to block, but even worse, it will also cause xine_play() to block in the demuxer stop() method. Since xine_play() is called from the GUI event handler, this will cause the GUI to hang! My problem right now is how to fix this properly... IMHO, this whole thing is a big mess that perhaps should be cleaned up a bit to avoid similar things from happening? My feeling is that this definitely is a 1.0 showstopper. Perhaps someone with a better understanding of the internal architecture could have a look at this? -- Christer Palm |
From: Miguel F. <mi...@ce...> - 2001-12-24 11:36:30
|
Hi Christer, Christer Palm wrote: > > I have now debugged this for (quite) a while, and it appears that there > are several conditions under which xine can hang when pounding it with > seek requests. > > The situation that occurs most often in my test case (just playing a > simple mp3 file), however, seems to be a race condition related to the > audio output thread. > > 1) When fiddling with the seek "knob", xine_play() is called repedeately. > > 2) xine_play() shuts down the whole thing by calling the demuxers stop() > method. The demuxer stop() method immediately shuts down the demuxer > thread, and then posts an asyncronous request to the decoder to stop its > output driver. This is done by allocating a buffer off the decoder > buffer pool, fill it in with a BUF_CONTROL_END request and then post it > on the decoders fifo. > > If there are no buffers available in the decoders buffer pool, the > demuxer blocks until one is available. Otherwise, the stop() method > immediately returns control to xine_play() without waiting for the > output drivers to actually shut down. > > 3) xine_play() now restarts the demuxer by calling its start() method. > Eventually, this will cause the decoder plugin to re-open the output > driver by calling its open() method. > > 4) The output driver open() method checks to see if the output driver > thread is already running. If it is, it just prints a warning message > and returns. > > > Now, sometimes, step 4 is reached before the output driver has actually > shut down. So the output driver open() method will just return thinking > that the output driver thread is already running. This is evident from > "audio_out: pthread already running!" messages on stdout. This is surelly a bug. I fixed it before and added this "warning" message just in case if it ever returns... :) The situation isn't as bad as you say. The demux -> decoder comunication is done throught a fifo, so there's no way of a BUF_CONTROL_START can be processed before a BUF_CONTROL_END. The bug you seem to have found must be that audio_out->close haven't finished the output thread by some reason. > This causes audio to disappear. However, it also causes a domino effect > on the whole thing; > > Since there is now no driver thread, there is no one to pick up the > decoded frames from the output fifo. This will eventually cause the > output fifo to fill up, which appears to cause the decoder to block. > > If the decoder blocks, there is no one to pick up stuff on the decoder > fifo, so eventually that will also fill up. This will probably cause the > demuxer to block, but even worse, it will also cause xine_play() to > block in the demuxer stop() method. Since xine_play() is called from the > GUI event handler, this will cause the GUI to hang! > > > My problem right now is how to fix this properly... IMHO, this whole > thing is a big mess that perhaps should be cleaned up a bit to avoid > similar things from happening? This kind of domino effect will only happen if some output driver blocks or if a decoder hangs. I've never seen the former (except due to bugs like this) and the second should only be caused by really bad streams. Of course we should try to fix both cases, but it doesn't mean we have to change the architecture... > My feeling is that this definitely is a 1.0 showstopper. > Perhaps someone with a better understanding of the internal architecture > could have a look at this? I will try to reproduce this. As i said, i fixed this same problem with audio and video streams some time ago. I have never tested much with mp3 files though... regards, Miguel PS: Thanks for the detailed report! :) |
From: Miguel F. <mi...@ce...> - 2001-12-24 12:20:51
|
Miguel Freitas wrote: >> My feeling is that this definitely is a 1.0 showstopper. >> Perhaps someone with a better understanding of the internal >> architecture could have a look at this? > > I will try to reproduce this. As i said, i fixed this same problem with > audio and video streams some time ago. I have never tested much with mp3 > files though... It should be fixed on cvs, give it a try... bye and have a merry xmas! Miguel |
From: Christer P. <pa...@no...> - 2001-12-24 17:10:11
|
Hi Miguel! Miguel Freitas wrote: > > Christer Palm wrote: >> >> Now, sometimes, step 4 is reached before the output driver has >> actually shut down. So the output driver open() method will just >> return thinking that the output driver thread is already running. This >> is evident from "audio_out: pthread already running!" messages on stdout. > > This is surelly a bug. I fixed it before and added this "warning" > message just in case if it ever returns... :) > He he, OK. > The situation isn't as bad as you say. The demux -> decoder comunication > is done throught a fifo, so there's no way of a BUF_CONTROL_START can be > processed before a BUF_CONTROL_END. The bug you seem to have found must > be that audio_out->close haven't finished the output thread by some reason. Right, this particular problem is not in the demuxer->decoder communication, but in the decoder->output plugin communication [i.e. ao_close() and the audio_loop_running variable]. > Of course we should try to fix both cases, but it doesn't mean we have > to change the architecture... No, I wasn't thinking about changing the general architecture. But the way the control flow is currently implemented [i.e. scattered throughout the code] is, at least to me, very messy, and is like begging for problems like this. Who controls what?, when?, in what way?, through what mechanisms?, synchronous or asyncronous?, and who manages the dependencies?... To fix that, one needs to have a very good understanding about the details of how things are supposed to work. > I will try to reproduce this. As i said, i fixed this same problem with > audio and video streams some time ago. I have never tested much with mp3 > files though... I see the same problems when seeking in video. Just thought that using an audio-only file would make debugging more easy. > It should be fixed on cvs, give it a try... Cool, I will check it out tonight. > bye and have a merry xmas! Feliz Natal! -- Christer Palm |
From: Christer P. <pa...@no...> - 2001-12-25 05:00:46
|
Christer Palm wrote: > Miguel Freitas wrote: > > > > It should be fixed on cvs, give it a try... > > Cool, I will check it out tonight. > OK, checked it out, and looks much better now. I still see two hang situations when seeking. One is definitely X related, looks like some kind of XLockDisplay() deadlock and is possibly a bug in XFree86. I'll look further into that later on... The other one, however, seems to be related to the bug Miguel just squashed, it might possibly even be the same bug in a different disguise. At this time, however, I don't really understand exactly why it happens. The symptom is that audio suddenly disappears. This time, the GUI is alive and I can start/stop/seek, but nothing happens. If I play/seek the position know just jumps back to the position where the problem occured. In the debugger, I can see that the audio decoder sits there waiting on its fifo, there are no audio or video driver threads running, and the GUI threads are idle. The video decoder, however, is blocked inside metronom_video_stream_end() waiting for an audio_ended condition, which never appears to occur: if (this->have_audio) { /* while (this->audio_stream_running) { */ if (this->audio_stream_running) { printf ("metronom: waiting for audio to end...\n"); pthread_cond_wait (&this->audio_ended, &this->lock); <<<<<< HERE } } However, this just appears to be a consequence of something else happening earlier on. This is the last few lines of console output when the problem occurs: --------------------------------------------------- metronom: audio stream end metronom: Signalling audio_ended metronom: video stream start... metronom: waiting for audio to start... metronom: audio stream start... metronom: video stream end metronom: waiting for audio to end... metronom: audio stream start...done metronom: audio stream end metronom: Signalling audio_ended metronom: video stream start... metronom: waiting for audio to start... metronom: audio stream start... metronom: audio stream start...done audio_loop: using decoder >mad< libmad: audio sample rate 22050 mode 00000008 audio_oss_out: ao_open rate=22050, mode=8 audio_oss_out: audio rate : 22050 requested, 22050 provided by device/sec audio_oss_out : 2 channels output audio_out: output sample rate 22050 audio_out: thread created audio_out: inserting 7275 0-frames to fill a gap of 29800 pts libmad: audio sample rate 44100 mode 00000008 audio_out: stopping thread... audio_out: thread stopped, closing driver >>>> Here problem has occured, no audio anymore. I press stop. >>>> Seek knob returns to home position. xine_stop xine_stop: stopping demuxer metronom: video stream end metronom: waiting for audio to end... xine_stop: done >>>> Now I press play again. Seek knob now returns to last position and >>>> still no audio. xine-panel: PLAY xine_play: xine open /tmp/test.mp3, start pos = 0, start time = 0 (sec) xine: using input plugin >file< for this MRL (/tmp/test.mp3). xine: using demuxer plugin >MPGAUDIO< for this MRL. mpgaudio: bitrate = 61.01kbps ------------------------------------------------------------- As you can see just before the problem occurs, events are heavily out-of-sequence as a result of the asyncronous nature of much of this stuff. However, the problem seems to occur as the audio_out driver suddenly decides to shut down again right after it was started. Apparently, metronome thinks it's running though, because the next step, where I press stop, is where metronome blocks waiting for audio_ended. This, of course, effectively disables control over most of the pipeline as evident from the output when I press play again. This is where it stands as of now. I will continue too look at this later, unless someone else has a good clue. -- Christer Palm |
From: Miguel F. <mi...@ce...> - 2001-12-25 11:46:51
|
Christer Palm wrote: > audio_loop: using decoder >mad< > libmad: audio sample rate 22050 mode 00000008 > audio_oss_out: ao_open rate=22050, mode=8 > audio_oss_out: audio rate : 22050 requested, 22050 provided by device/sec > audio_oss_out : 2 channels output > audio_out: output sample rate 22050 > audio_out: thread created > audio_out: inserting 7275 0-frames to fill a gap of 29800 pts > libmad: audio sample rate 44100 mode 00000008 > audio_out: stopping thread... > audio_out: thread stopped, closing driver > > >>>> Here problem has occured, no audio anymore. I press stop. > >>>> Seek knob returns to home position. > > (...) > As you can see just before the problem occurs, events are heavily > out-of-sequence as a result of the asyncronous nature of much of this > stuff. However, the problem seems to occur as the audio_out driver > suddenly decides to shut down again right after it was started. Try again, i commited stuff that might fix this! regards, Miguel PS: Do you speak portuguese?? ;) |
From: Christer P. <pa...@no...> - 2001-12-25 20:35:20
|
Miguel Freitas wrote: > Christer Palm wrote: >> (...) As you can see just before the problem occurs, events are >> heavily out-of-sequence as a result of the asyncronous nature of much >> of this stuff. However, the problem seems to occur as the audio_out >> driver suddenly decides to shut down again right after it was started. > > Try again, i commited stuff that might fix this! > OK, super! Can't reproduce with the mp3 file now. So I switched to my video files... :-) This is what happens now (where's the demuxer??) [tested on default CVS branch]: Thread 1: pthread_cond_wait buffer_pool_alloc [buffer.c:92] file_plugin_read_block [input_file.c:314] parse_mpeg1_packet [demux_mpeg.c:363] parse_pack_preview [demux_mpeg.c:598] demux_mpeg_start [demux_mpeg.c:738] xine_play [xine.c:350] gui_set_current_position [actions.c:311] ... gui_run [event.c:1080] main [main.c:703] Thread 2: (pthread manager) Thread 3: (slider_loop, sleeping) Thread 4: (metronom, sleeping) Thread 5: pthread_cond_wait vo_get_frame [video_out.c:98] mpeg2_decode_data [decode.c:236] mpeg2dec_decode_data [xine_decoder.c:74] video_decoder_loop [video_decoder.c:235] Thread 6: pthread_cond_wait metronom_audio_stream_end [metronom.c:366] audio_decoder_loop [audio_decoder.c:90] Thread 7: pause video_out_loop [video_out.c:216] pthread_start_thread [manager.c:284] Thread 8: pthread_cond_wait ao_loop [audio_out.c:142] Releveant snip from console output: xine: using input plugin >file< for this MRL demux_avi: AVI_init failed (AVI_errno: 9) xine: using demuxer plugin >MPEG< for this MRL. metronom: audio stream start... metronom: waiting for video to start... metronom: video stream start... metronom: audio stream start...done audio_loop: using decoder >mad< video_out : thread created video_decoder: using decoder >mpeg2dec< mpeg2dec: frame size 352 x 288 video_decoder: discontinuity ahead metronom: video discontinuity #1 metronom: waiting for audio discontinuity #1 audio_decoder: discontinuity ahead metronom: audio discontinuity #1 video_decoder: discontinuity ahead metronom: video discontinuity #2 metronom: waiting for audio discontinuity #2 audio_decoder: discontinuity ahead metronom: audio discontinuity #2 libmad: audio sample rate 44100 mode 00000008 audio_oss_out: ao_open rate=44100, mode=8 audio_oss_out: audio rate : 44100 requested, 44100 provided by device/sec audio_oss_out : 2 channels output audio_out: output sample rate 44100 audio_out: thread created metronom: audio pts discontinuity/start, pts is 154737139, wrap_offset is -154707139, vpts is 30000 audio_out: inserting 11518 0-frames to fill a gap of 23543 pts xine_play: xine open start pos = 33725, start time = 0 (sec) metronom: video pts discontinuity/start, pts is 154755360, wrap_offset is -154710360, vpts is 45000 libmpeg2: blasting out backward reference frame on flush demux_mpeg: stop... xine: using input plugin >file< for this MRL demux_avi: AVI_init failed (AVI_errno: 9) xine: using demuxer plugin >MPEG< for this MRL. video_out : throwing away image with pts 80322 because it's too old (diff : 2635733 > 1800). metronom: audio stream end metronom: waiting for video to end... xine_play: xine open start pos = 35066, start time = 0 (sec) demux_mpeg: stop... xine: using input plugin >file< for this MRL demux_avi: AVI_init failed (AVI_errno: 9) xine: using demuxer plugin >MPEG< for this MRL. xine_play: xine open start pos = 36599, start time = 0 (sec) demux_mpeg: stop... demux loop finished (status: 1, buf:ffff602e) xine: using input plugin >file< for this MRL demux_avi: AVI_init failed (AVI_errno: 9) xine: using demuxer plugin >MPEG< for this MRL. xine_play: xine open start pos = 36791, start time = 0 (sec) demux_mpeg: stop... Perhaps this is related to Guenters latest updates? Btw, anybody knows if "Neverending Story" is available on DVD? :-) -- Christer Palm |