Thread: [Mlt-devel] Future and speed of MLT
From: Stefan de K. <sk...@xs...> - 2007-02-28 00:00:34
|
Hi,

Today we did a practical session of MLT and libdv testing with Sony equipment. MLT broke down with segmentation faults (we suspect a too-old version of libdv), and when trying it with SDL the colourspace of the DV video was wrong. (Using inigo with SDL/Xv.)

The idea of MLT as a broadcast framework, with a broadcast server that has a 1:1 relation with the editing software, is a combination I really like. But I cannot be the only one who sees that on 'normal' hardware A/V synchronization is nowhere to be found, and that playing a normal DV video causes trouble.

I remember I addressed these issues in a previous email; I have now tested everything with accelerated ATI HW surfaces and 'pass-through' to /dev/dv1394/0. (Please change /dev/dv1394 in the CVS.)

What can be considered 'issues' if I compare inigo to, for example, the VideoLAN client (in my humble opinion the best player that is able to keep synchronization) or mplayer/ffmpeg? Why is inigo slow? Has any profiling been done?

I try to be constructive in this situation, but this is not the first 'media' framework that has a great design idea but lacks the speed to be nice to use in a production environment. To give a better example: currently I'm using xloadimage and vlc/mplayer for all my broadcast operations. But I prefer something that is designed like MLT. What should I do, stay or leave?

Yours Sincerely,
Stefan de Konink
From: Dan D. <da...@de...> - 2007-02-28 00:42:46
|
On Tuesday 27 February 2007 4:00 pm, Stefan de Konink wrote:
> Today we did a practical session of MLT and libdv testing with Sony
> equipment. MLT broke down with segmentation faults (we suspect a too old
> version of libdv) and when trying in with SDL the colourspace of the
> dv-video was wrong. (Using inigo with SDL/Xv.)

I am unable to reproduce this.

> The idea of MLT as broadcastframework with a broadcastserver that has a
> 1:1 relation with editing software is a combination I really like. But I
> cannot be the only one who sees that on 'normal' hardware A/V
> synchronization is nowhere to be found, and playing a normal DV video is
> causing troubles.

I am unable to reproduce this with DV files as a source. A/V sync is highly dependent upon the producer plugins (a combination of data reader, demuxer, and decoder) doing the right thing. If a producer is not specified explicitly, then the apps use fezzik, and fezzik.dict contains the rules for which producers serve which file extensions. Most things fall back to ffmpeg/libavformat. The framework carries the audio for each frame together with the image, so there should not be a problem in the framework. The SDL consumer is simple.

> I remember I addressed these issues in a previous email, now tested
> everything with accelerated Ati HW-surfaces and 'pass-through'
> /dev/dv1394/0. (Please change /dev/dv1394 in the CVS.)

Where? I do not understand to what you refer. Maybe you see something in

> What can be considered 'issues' if I compare inigo to for example
> VideoLAN client (in my humble opinion the best player that is able to
> keep synchronization) or mplayer/ffmpeg? Why is inigo slow? Is there any
> profiling done?

I dunno. Do you do any profiling of it? I don't. I barely do any maintenance on it. What we developed over 6 months was sufficient in performance for the customer. The contract was not long enough to do much performance tuning; it was barely long enough to meet functional requirements and be stable. I have been mainly focused on Kino since that engagement.

If you are using the SDL consumer, then you should use the consumer property rescale=none to prevent fezzik from trying to scale the producers' outputs to PAL resolution. When I did performance comparisons of inigo against ffplay, mplayer, and Xine in 2004 using that consumer option, the performance was always worse than those dedicated players, but not by much. Of course, that is a general statement, and your mileage may vary.

> I try to be constructive in this situation, but this is not the first
> 'media' framework that has a great design idea but lacks the speed to be
> nice to use in production environment. To give a better example:
> currently I'm using xloadimage and vlc/mplayer for all my broadcast
> operations. But I prefer something that is designed like MLT. What
> should I do stay or leave?

It sounds like you are disappointed. I have no financial or evangelical incentive to change your mind. Consider that most of MLT was developed during the first 6 months of 2004 and has been only lightly maintained since then. The results for that are pretty good, but it also speaks for the project's activity and commitment levels. I am interested in giving MLT more attention since I am winding up Kino and want to assist kdenlive. However, I am not willing to make any commitments to you or this project because I already have a fairly stable job I am not interested in leaving.

--
+-DRD-+
From: Stefan de K. <sk...@xs...> - 2007-02-28 01:10:48
|
Hi Dan,

Dan Dennedy wrote:
> On Tuesday 27 February 2007 4:00 pm, Stefan de Konink wrote:
>> Today we did a practical session of MLT and libdv testing with Sony
>> equipment. MLT broke down with segmentation faults (we suspect a too old
>> version of libdv) and when trying in with SDL the colourspace of the
>> dv-video was wrong. (Using inigo with SDL/Xv.)
>
> I am unable to reproduce this.

What can I provide to reproduce it? I'm using a plain DV file, on PPC32 with an ATI video card.

>> The idea of MLT as broadcastframework with a broadcastserver that has a
>> 1:1 relation with editing software is a combination I really like. But I
>> cannot be the only one who sees that on 'normal' hardware A/V
>> synchronization is nowhere to be found, and playing a normal DV video is
>> causing troubles.
>
> I am unable to reproduce this with DV files as a source. A/V sync is highly
> dependent upon the producer plugins (combination data reader, demuxer,
> decoder) to do the right thing. If a producer is not specified explicitly,
> then the apps use fezzik, and fezzik.dict contains the rules for which
> producers serve which file extensions. Most things fallback to
> ffmpeg/libavformat. The framework carries the audio for each frame together
> with the image; so there should not be a problem in the framework. The SDL
> consumer is simple.

Even a simple DV file with no rescale is able to bring a Pentium III or Pentium IV class system to the ground using inigo and SDL. Last time I mentioned the issues I was using the proprietary nVidia drivers. Now I am using the open-source ATI drivers, so with accelerated SDL.

>> I remember I addressed these issues in a previous email, now tested
>> everything with accelerated Ati HW-surfaces and 'pass-through'
>> /dev/dv1394/0. (Please change /dev/dv1394 in the CVS.)
>
> Where? I do not understand to what you refer. Maybe you see something in

It is in the demo directory. On my udev system I don't have any references to /dev/dv1394 anymore.

>> What can be considered 'issues' if I compare inigo to for example
>> VideoLAN client (in my humble opinion the best player that is able to
>> keep synchronization) or mplayer/ffmpeg? Why is inigo slow? Is there any
>> profiling done?
>
> I dunno. Do you do any profiling of it? I don't. I barely do any maintenance
> on it. What we developed over 6 months was sufficient in performance for the
> customer. The contract was not long enough to do much performance tuning--it
> was barely long enough to meet functional requirements and be stable. I have
> been mainly focused on Kino since that engagement.
>
> If you are using the SDL consumer, then you should use the consumer property
> rescale=none to prevent fezzik from trying to scale the producers' outputs to
> PAL resolution. When I did performance comparisons of inigo against ffplay,
> mplayer, and Xine in 2004 using that consumer option, the performance was
> always worse than those dedicated players, but not much. Of course, that is a
> general statement, and your mileage may vary.
>
>> I try to be constructive in this situation, but this is not the first
>> 'media' framework that has a great design idea but lacks the speed to be
>> nice to use in production environment. To give a better example:
>> currently I'm using xloadimage and vlc/mplayer for all my broadcast
>> operations. But I prefer something that is designed like MLT. What
>> should I do stay or leave?
>
> It sounds like you are disappointed. I have no financial or evangelical
> incentive to change your mind. Consider that most of MLT was developed during
> the first 6 months of 2004 and lightly maintained since then. The results for
> that are pretty good, but it also speaks for the project's activity and
> commitment levels. I am interested in giving MLT more attention since I am
> winding up Kino and wanting to assist kdenlive. However, I am not willing to
> make any commitments to you or this project because I already have a fairly
> stable job I am not interested in leaving.

Ok, I get your point. And yes, you can say I'm disappointed. But on the other hand, I didn't find any integrated open source product that fulfills all my 'wishes'. I think *all* the play-out products I have seen so far (here meaning stuff that can do more than play a playlist and overlay a simple image) lack speed and, more or less, stability.

I am in no position to criticize the work that is done in this project, mainly because I didn't make it better (yet). But as you say: if most of the work was done in 2004, what impulses does this project lack to become something in the order of Apache, Bind, etc.?

To make my point clear:
1) I have no intention to reinvent the wheel, that is plain stupid.
2) I love miracle.
3) My intention is to make this work, but I really have no clue where the performance drain goes, and I'm ready to find out.
4) I'll stick here, until there is a killer app (and yes this can be it).

Stefan
From: Dan D. <da...@de...> - 2007-02-28 07:49:50
|
On Tuesday 27 February 2007 17:10, Stefan de Konink wrote:
> Hi Dan,
>
> Dan Dennedy wrote:
> > On Tuesday 27 February 2007 4:00 pm, Stefan de Konink wrote:
> >> Today we did a practical session of MLT and libdv testing with Sony
> >> equipment. MLT broke down with segmentation faults (we suspect a too old
> >> version of libdv) and when trying in with SDL the colourspace of the
> >> dv-video was wrong. (Using inigo with SDL/Xv.)
> >
> > I am unable to reproduce this.
>
> What can I provide to reproduce it? I'm using a plain DV file, with
> PPC32 and Ati videocard.

MLT is untested on big endian, and not much attention was paid to endian issues during development. :-/ And I expect codec performance on PPC to be abysmal.

> >> The idea of MLT as broadcastframework with a broadcastserver that has a
> >> 1:1 relation with editing software is a combination I really like. But I
> >> cannot be the only one who sees that on 'normal' hardware A/V
> >> synchronization is nowhere to be found, and playing a normal DV video is
> >> causing troubles.
> >
> > I am unable to reproduce this with DV files as a source. A/V sync is
> > highly dependent upon the producer plugins (combination data reader,
> > demuxer, decoder) to do the right thing. If a producer is not specified
> > explicitly, then the apps use fezzik, and fezzik.dict contains the rules
> > for which producers serve which file extensions. Most things fallback to
> > ffmpeg/libavformat. The framework carries the audio for each frame
> > together with the image; so there should not be a problem in the
> > framework. The SDL consumer is simple.
>
> Even a simple dv file with no rescale is able to get a Pentium III or
> Pentium IV class system to the ground using inigo and SDL. Last time I

libdv decoding requires a 1 GHz PIII to decode in realtime to a display with Xv. I did some quick performance comparisons on my dual AthlonXP 2000 with an ancient Matrox G450 AGP video card. All results were based on top's %CPU:

    inigo with libdv:           47-53%
    ffplay (very recent):       40-45%
    mplayer with libdv:         40-45%
    mplayer with ffdv:          35-40%
    inigo with MainConcept DV:  34-37%

You should use the avformat MLT producer for DV since it is faster. I could not produce numbers for inigo+avformat today because I have currently broken my avformat producer while trying to add support for ffmpeg's new libswscale, but I would guess it to be in the 42-47% range.

I wonder what is wrong with the performance on your system? Maybe MLT benefits a lot from SMP. I have a single-processor 2.4 GHz P4 machine at work that has MLT with libdv and ffmpeg on it already, and I will test it tomorrow and compare the performance results.

> >> I remember I addressed these issues in a previous email, now tested
> >> everything with accelerated Ati HW-surfaces and 'pass-through'
> >> /dev/dv1394/0. (Please change /dev/dv1394 in the CVS.)

"pass-through" is not fully implemented, if I understand your meaning of that term correctly. The libdv producer "attaches" the DV data to the frame object passing through the framework, but it also always decodes it. Currently, no consumer module processes the dv_data attachment for file or dv1394 output. Rather, the libdv consumer always encodes the uncompressed data on the frame to DV. As DV encoding is extremely intensive, you should expect significant performance problems going that route. The MLT sponsor, UEL, uses uncompressed SDI output.

> > Where? I do not understand to what you refer. Maybe you see something in
>
> It is in the demo directory. On my udev system I don't have any
> references to /dev/dv1394 anymore.

Heh, well, the problem with dv1394 is that there is no consistency in the naming of these device files across systems, especially between kernel 2.4 and 2.6 systems, but lately I do see /dev/dv1394/<N> quite commonly on 2.6-based systems. So, I am going to change it as you suggest.

> >> I try to be constructive in this situation, but this is not the first
> >> 'media' framework that has a great design idea but lacks the speed to be
> >> nice to use in production environment. To give a better example:
> >> currently I'm using xloadimage and vlc/mplayer for all my broadcast
> >> operations. But I prefer something that is designed like MLT. What
> >> should I do stay or leave?
> >
> > It sounds like you are disappointed. I have no financial or evangelical
> > incentive to change your mind. Consider that most of MLT was developed
> > during the first 6 months of 2004 and lightly maintained since then. The
> > results for that are pretty good, but it also speaks for the project's
> > activity and commitment levels. I am interested in giving MLT more
> > attention since I am winding up Kino and wanting to assist kdenlive.
> > However, I am not willing to make any commitments to you or this project
> > because I already have a fairly stable job I am not interested in
> > leaving.
>
> Ok, I get your point. And yes you can say I'm disappointed. But on the
> other hand: I didn't find any integrated open source product that
> fulfill all my 'wishes'. I think *all* play-out products, here meaning
> stuff that can do more then play a playlist and overlay a simple image,
> that I currently saw lack speed. And more or less stability.
>
> I am in no position to criticize the work that is done in this project,
> mainly because I didn't make it better (yet). But as you say; if the
> most work is done in 2004 what impulses does this project lack to become
> something in the order of Apache, Bind, etc.?

That would require at least two full-time developers and a cast of users providing QA feedback. We are getting the user base with kdenlive, but Jean-Baptiste and I are currently working on MLT only very part time.

> To make my point clear:
> 1) I have no intention to reinvent the wheel, that is plain stupid.
> 2) I love miracle.
> 3) My intention is to make this work, but I have really no clue were the
> performance drain goes, and I'm ready to find out.
> 4) I'll stick here, until there is a killer app (and yes this can be it).

I too was excited in 2004 with the results that we achieved in such a short time, and with BlueFish444 SDI output, it was really cool! However, I did not stay on MLT because I was still interested in supporting Kino, considering the popularity it had achieved, and because I have significant financial and time obligations to my family.

Please send the %CPU range you see using an x86 system with these invocations:

    inigo libdv:some.dv -consumer sdl rescale=none
    inigo avformat:some.dv -consumer sdl rescale=none

Also, indicate any SDL_VIDEO* environment variables you may have set. I found a list of SDL env vars here: ftp://ptah.lnf.kth.se/pub/misc/sdl-env-vars. I do not know what all of the video vars do, and I have not set any of them, but they could have an impact.
From: Stefan de K. <sk...@xs...> - 2007-02-28 12:50:53
|
Dan Dennedy wrote:
> MLT is untested on big endian and not much attention paid to endian issues
> during development. :-/ And I expect codec performance on PPC to be abysmal.

I don't blame you ;) libdv is poor, yes, but ffmpeg has some nice AltiVec-optimized parts.

> "pass-through" is not fully implemented if I understand your meaning of that
> term correctly. The libdv producer "attaches" the DV data to the frame object
> passing through the framework, but it also always decodes it. Currently, no
> consumer module processes the dv_data attachment for file or dv1394 output.
> Rather, the libdv consumer always encodes the uncompressed data on the frame
> to DV. As DV encoding is extremely intensive, you should expect significant
> performance problems going that route. The MLT sponsor, UEL, uses
> uncompressed SDI output.

Probably most people want fast decompression for SDL (to TV-out) or SDI output. But I have met enough people who prefer an external FireWire-based solution, such as a Canopus box or a Sony video mixer. Maybe this is a feature that could go on a 'wish' list. It is how most render engines for video editing work: only adapt what has changed, and pass through the rest. That way one could pre-render the 'heavy' parts well before they are played, since the 'cat' operation needed for playback isn't a difficult task.

> Please send the %CPU range you see using an x86 system with these invocations:
> inigo libdv:some.dv -consumer sdl rescale=none

    real    0m39.332s
    user    0m9.581s
    sys     0m1.432s

> inigo avformat:some.dv -consumer sdl rescale=none

    real    0m34.271s
    user    0m5.488s
    sys     0m0.764s

This was measured on an AMD64 X2 5600+. When I look at fast movement, there is still a noticeable lag.

To do some comparison:

mplayer rechts.dv (which uses the FFmpeg DV decoder, YV12 with Xv)

    real    0m33.284s
    user    0m3.352s
    sys     0m0.008s

Now my main reason why I can't use mplayer for all my broadcast work is that some @#$&@(@$& (read: stupid people) render AVIs in Premiere with DV video+audio + newaudio, and mplayer has problems keeping that in sync over long periods. Sometimes it even selects the wrong audio channels to play.

> Also, indicate any SDL _VIDEO* environment variables you may have set. I found
> a list of SDL env vars here: ftp://ptah.lnf.kth.se/pub/misc/sdl-env-vars.
> I do not know what all of the video vars do, and I have not set any of them,
> but they could have an impact.

I did not set any for the test. I should be able to do a test with mplayer using SDL and see the impact. But I haven't compiled mplayer with SDL support yet...

Thanks for your responses so far. I know I can be a real pain or maybe sound arrogant.

Stefan
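A minimal sketch of the pass-through idea discussed in this message, under stated assumptions: the `demo_frame` structure, the `modified` flag, and `demo_emit()` are hypothetical names invented for illustration and are not part of the MLT API. Dan's description only says that the libdv producer attaches the compressed DV data to the frame and that no consumer currently uses it; how a consumer would know that a frame was left untouched by filters is an assumption here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame type for illustration only; NOT an MLT structure. */
struct demo_frame {
    const uint8_t *dv_data;  /* compressed DV attached by the producer, or NULL */
    size_t dv_size;          /* size of that attachment in bytes */
    int modified;            /* assumed flag: non-zero if a filter changed the image */
};

enum demo_path { DEMO_PASS_THROUGH, DEMO_REENCODE };

/* Decide how a consumer could emit one frame. When the original DV bytes
 * are attached and the image is untouched, they are copied out as-is;
 * otherwise the caller would have to run the (expensive) DV encoder. */
enum demo_path demo_emit(const struct demo_frame *f,
                         uint8_t *out, size_t out_cap, size_t *written)
{
    if (f->dv_data != NULL && !f->modified && f->dv_size <= out_cap) {
        memcpy(out, f->dv_data, f->dv_size);
        *written = f->dv_size;
        return DEMO_PASS_THROUGH;
    }
    *written = 0;            /* nothing written; re-encoding is up to the caller */
    return DEMO_REENCODE;
}

#ifdef DEMO_MAIN
#include <stdio.h>
int main(void)
{
    uint8_t dv[4] = { 1, 2, 3, 4 }, out[8];
    size_t n = 0;
    struct demo_frame f = { dv, sizeof dv, 0 };
    printf("untouched frame: %s\n",
           demo_emit(&f, out, sizeof out, &n) == DEMO_PASS_THROUGH
               ? "pass-through" : "re-encode");
    return 0;
}
#endif
```

The point of the sketch is only the decision: copying the attached bytes is cheap, whereas the re-encode branch is where the DV encoding cost mentioned above would be paid.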
From: Stephane F. <f8...@fr...> - 2007-02-28 18:00:30
|
Tue, Feb 27, 2007 at 11:49:39PM -0800, Dan Dennedy skribis:
> On Tuesday 27 February 2007 17:10, Stefan de Konink wrote:
> > Dan Dennedy wrote:
> > > On Tuesday 27 February 2007 4:00 pm, Stefan de Konink wrote:
> > >> Today we did a practical session of MLT and libdv testing with Sony
> > >> equipment. MLT broke down with segmentation faults (we suspect a too old
> > >> version of libdv) and when trying in with SDL the colourspace of the
> > >> dv-video was wrong. (Using inigo with SDL/Xv.)

Have you tried with an updated libdv? Is the problem in MLT or elsewhere? Could the colourspace problem be an endian problem, with an rgb/bgr or yuv/vuy handling bug? I've seen some MLT filter code which is not endian clean with respect to pixel access.

> > What can I provide to reproduce it? I'm using a plain DV file, with
> > PPC32 and Ati videocard.
>
> MLT is untested on big endian and not much attention paid to endian issues
> during development. :-/ And I expect codec performance on PPC to be abysmal.

Let's assume the libs MLT depends on are endian clean. I think you should be able to check whether MLT is endian clean with this test:

    inigo libdv:somefile.dv -consumer libdv:recoded1.dv rescale=none

then compare both files, somefile.dv and recoded1.dv, using a tool known to work (e.g. xine, mplayer, etc.). You can do the same with the avformat (FFmpeg) libraries:

    inigo avformat:somefile.dv -consumer avformat:recoded2.dv rescale=none

Does anyone know a free tool to compute PSNR between two video files? A non-interactive program could let us do somewhat automatic testing. QA painlessly..

> > Even a simple dv file with no rescale is able to get a Pentium III or
> > Pentium IV class system to the ground using inigo and SDL. Last time I
>
> libdv decoding requires an 1GHz PIII to decode in realtime to a display with
> Xv. I did some quick performance comparisons on my dual AthlonXP 2000 with an
> ancient Matrox G450 AGP video card. All results were based on top's %CPU:
>
> inigo with libdv: 47-53%
> ffplay (very recent): 40-45%
> mplayer with libdv: 40-45%
> mplayer with ffdv: 35-40%
> inigo with MainConcept DV: 34-37%

IMO, Stefan is right: measurements with 'time' are better than % of CPU because of frequency scaling. Here's my input on a laptop with a Pentium-M:

    libdv
    real    0m19.608s
    user    0m10.057s
    sys     0m0.244s

    avformat
    real    0m19.094s
    user    0m9.449s
    sys     0m0.432s

    mplayer
    real    0m18.374s
    user    0m6.680s
    sys     0m0.132s

Tests were run twice so that files/libs were in the page cache after the first run.

> > It is in the demo directory. On my udev system I don't have any
> > references to /dev/dv1394 anymore.
>
> Heh, well, the problem with dv1394 is that there is no consistency in the
> naming of these device files across systems especially between kernel 2.4 and
> 2.6 systems, but lately I do see /dev/dv1394/<N> quite commonplace on
> 2.6-based systems. So, I am going to change it as you suggest.

I don't have dv1394 hardware, but could the dv1394 device name be passed as an option parameter (property?) to the MLT framework?

> > >> I try to be constructive in this situation, but this is not the first
> > >> 'media' framework that has a great design idea but lacks the speed to be
> > >> nice to use in production environment.

So if it's broken, let's fix it!

Talking about speed, we have this special gift on Linux which is named "OProfile". This is a system-wide, statistical profiler. Have a look at http://oprofile.sf.net

For apt-get junkies, this is a matter of:
- sudo apt-get install oprofile linux-debug
- compile all your code (MLT, ..) with debug info (-g). This is not necessary, but you'll get source code line info with it!
- sudo opcontrol --init
- inigo avformat:pinguin.dv -consumer sdl rescale=none
- sudo opcontrol --start
- wait for end of inigo
- sudo opcontrol --stop
- opreport -lg

CPU: PIII, speed 1733 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 100000
warning: some functions compiled without debug information may have incorrect source line attributions
samples  %        linenr info                app name                       symbol name
17643    18.0900  dv.c:293                   libavcodec.so.51.34.0          dv_decode_ac
9627      9.8709  simple_idct_mmx.c:1286     libavcodec.so.51.34.0          ff_simple_idct_put_mmx
9207      9.4403  (no location information)  libc-2.4.so                    memcpy
8254      8.4631  filter_rescale.c:113       libmltcore.so                  scale_alpha
7871      8.0704  imgconvert.c:1114          libavcodec.so.51.34.0          yuv420p_to_yuyv422
6835      7.0082  dv.c:997                   libavcodec.so.51.34.0          dv_decode_mt
4989      5.1154  (no location information)  vmlinux-dbg-2.6.17-11-generic  delay_pmtmr
3865      3.9629  simple_idct.c:469          libavcodec.so.51.34.0          simple_idct248_put
3687      3.7804  (no location information)  radeon                         (no symbols)
2152      2.2065  (no location information)  libc-2.4.so                    memset
1861      1.9082  (no location information)  vmlinux-dbg-2.6.17-11-generic  rtc_cmos_read
1844      1.8907  resample2.c:189            libavcodec.so.51.34.0          av_resample
1651      1.6928  deinterlace.c:779          libmltxine.so                  deinterlace_yuv
1543      1.5821  (no location information)  libqt-mt.so.3.3.6              (no symbols)
1533      1.5718  (no location information)  vmlinux-dbg-2.6.17-11-generic  __copy_to_user_ll
1356      1.3904  dsputil_mmx.c:267          libavcodec.so.51.34.0          put_pixels_clamped_mmx
693       0.7106  dv.c:311                   libavformat.so.51.10.0         dv_produce_packet
495       0.5075  (no location information)  Xorg                           (no symbols)
489       0.5014  mlt_properties.c:692       libmlt.so.0.2.2                mlt_properties_get_data
472       0.4840  (no location information)  processor                      (no symbols)
445       0.4563  (no location information)  libasound.so.2.0.0             (no symbols)
430       0.4409  (no location information)  libfb.so                       fbCompositeSolidMask_nx8x8888mmx
309       0.3168  (no location information)  oprofiled                      (no symbols)
265       0.2717  (no location information)  libc-2.4.so                    strcmp
250       0.2563  (no location information)  libc-2.4.so                    memmove
235       0.2410  mlt_properties.c:568       libmlt.so.0.2.2                mlt_properties_get_int
206       0.2112  mlt_properties.c:577       libmlt.so.0.2.2                mlt_properties_set_int

And there it is! We're spending a bunch of time in libavcodec:dv_decode_ac() and also in ff_simple_idct_put_mmx(). These would be the first places to optimize. The ff_simple_idct_put looks already optimized for MMX though.

9.4% of the samples (i.e. hardware performance counter events, which are not exactly CPU cycles) were found in memcpy. It would be interesting to see who is calling memcpy, and maybe replace the glibc memcpy call with an alternate memcpy which would be cache aware (read prefetch, cache preallocate on write to save the painful read-on-write). The callers can be identified with a callgraph from OProfile. I'm pretty sure memset could be optimized too (e.g. dcbz under PPC32).

Interesting, scale_alpha shows up with 8%. Let's take a closer look:
- opannotate -s /usr/local/share/mlt/modules/libmltcore.so

 Samples  %        :static void scale_alpha( mlt_frame this, int iwidth, int iheight, int owidth, int oheight )
     2    0.0241   :{ /* scale_alpha total: 8254 99.3979 */
                   :    uint8_t *output = NULL;
                   :    uint8_t *input = mlt_frame_get_alpha_mask( this );
                   :
                   :    if ( input != NULL )
                   :    {
                   :        uint8_t *out_line;
                   :        int x, y;
                   :        int ox = ( iwidth << 10 ) / owidth;
                   :        int oy = ( iheight << 10 ) / oheight;
                   :
                   :        output = mlt_pool_alloc( owidth * oheight );
                   :        out_line = output;
                   :
                   :        // Loop for the entirety of our output height.
    41    0.4937   :        for ( y = 0; y < oheight; y ++ )
  2009   24.1932   :            for ( x = 0; x < owidth; x ++ )
  6202   74.6869   :                *out_line ++ = *( input + ( ( 512 + ( y * oy * iwidth ) + x * ox ) >> 10 ) );
                   :
                   :        // Set it back on the frame
                   :        mlt_properties_set_data( MLT_FRAME_PROPERTIES( this ), "alpha", output, owidth * oheight, mlt_pool_release, NULL );
                   :    }
                   :}

The loop could certainly be unrolled a little, along with some hoisting.

yuv420p_to_yuyv422 of ffmpeg is a good candidate for an MMX rewrite.

Perhaps delay_pmtmr is a function called by the radeon driver (for time sync), or a PM thing? I don't understand why rtc_cmos_read is so high. The libqt-mt.so.3.3.6 calls might simply be SDL and the GUI (KDE) I'm using.

Note: other CPUs/archs may show different bottlenecks..

Well, now we know how to find where to make things faster :-)
I would strongly recommend liboil for implementing the alternate speedups.
http://liboil.freedesktop.org/wiki/

--
Stephane
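The hoisting suggested for the scale_alpha() loop can be sketched as follows. This is an illustration on plain byte buffers, not a patch against MLT: the function names are made up, the MLT frame and pool calls are left out, and the index arithmetic is copied unchanged from the annotated listing (including its rounding behaviour), without judging whether that arithmetic is what the filter should compute or whether the function should run at all in this pipeline.

```c
#include <stdint.h>

/* Reference: the same index arithmetic as the annotated scale_alpha() loop. */
void scale_alpha_ref(const uint8_t *in, uint8_t *out,
                     int iwidth, int iheight, int owidth, int oheight)
{
    int ox = (iwidth << 10) / owidth;
    int oy = (iheight << 10) / oheight;
    for (int y = 0; y < oheight; y++)
        for (int x = 0; x < owidth; x++)
            *out++ = in[(512 + (y * oy * iwidth) + x * ox) >> 10];
}

/* Hoisted variant: the per-row term (512 + y * oy * iwidth) is carried
 * across rows and the per-column term (x * ox) is accumulated by addition,
 * so the inner loop has no multiplications left. Uses int like the
 * original, so very large frames could overflow in the same way. */
void scale_alpha_hoisted(const uint8_t *in, uint8_t *out,
                         int iwidth, int iheight, int owidth, int oheight)
{
    int ox = (iwidth << 10) / owidth;
    int oy = (iheight << 10) / oheight;
    int row_step = oy * iwidth;
    int row = 512;                          /* 512 + y * oy * iwidth */
    for (int y = 0; y < oheight; y++, row += row_step) {
        int pos = row;                      /* row + x * ox */
        for (int x = 0; x < owidth; x++, pos += ox)
            *out++ = in[pos >> 10];
    }
}
```

Both functions should produce identical output for the same input, which makes the reference version a convenient check when experimenting with further unrolling. Whether a compiler already performs this strength reduction at -O2 is worth measuring before hand-tuning.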
From: Dan D. <da...@de...> - 2007-02-28 18:33:35
|
On Wednesday 28 February 2007 10:00 am, Stephane Fillod wrote:
> Tue, Feb 27, 2007 at 11:49:39PM -0800, Dan Dennedy skribis:
> > Heh, well, the problem with dv1394 is that there is no consistency in the
> > naming of these device files across systems especially between kernel 2.4
> > and 2.6 systems, but lately I do see /dev/dv1394/<N> quite commonplace on
> > 2.6-based systems. So, I am going to change it as you suggest.
>
> I don't have dv1394 hardware, but could the dv1394 device name be passed as
> an option parameter (property?) to the MLT framework?

It already is, as the filename. The file Stefan pointed to was just a demo/example script.

> And there it is! We're spending a bunch of time in
> libavcodec:dv_decode_ac() and also in ff_simple_idct_put_mmx(). This would
> be the first places to optimize. The ff_simple_idct_put looks already
> optimized for MMX though.

I expect those to be the heaviest operations, and they are already somewhat optimized, I assume.

> 9.4% of the samples (ie. hardware performance counters, which is not
> exactly CPU cycles) were found in memcpy. That would be interesting to
> see who is calling memcpy, and maybe optimize the GLibc memcpy call with
> an alternate memcpy which would be cache aware (read prefetch, cache

Sounds promising.

> Interesting, scale_alpha shows up with 8%. Let's take a look closer:
[...]
> The loop could certainly be unrolled a little, along with some hoisting.

Eh?! I do not believe scale_alpha should even be called. I will look into this issue.

> yuv420p_to_yuyv422 of ffmpeg is a good candidate for MMX rewrite.

True, but we should avoid this sort of silly conversion when it's not necessary. There is no processing, and the SDL consumer could be set up to handle i420/yv12.

> Well, now we know how to find where to make things faster :-)
> I would strongly recommend liboil for implementing the alternate speedups.
> http://liboil.freedesktop.org/wiki/

I have heard of it, but do you know how it compares to ffmpeg/libswscale for colorspace conversion? We have some C colorspace code that is currently used a lot, but we also have filter_avcolour_space, which I have adapted to support swscale in my working copy, and I could try to integrate some more automatic, just-in-time colorspace conversion into the framework that can be assigned to a particular filter.

--
+-DRD-+
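To make the cost of the conversion discussed above concrete, here is a plain-C sketch of what a planar 4:2:0 to packed 4:2:2 (YUYV) conversion has to do. It is not FFmpeg's yuv420p_to_yuyv422; the function name is made up, and it assumes even dimensions, strides equal to the widths, and simple reuse of each chroma row for two output rows (no interpolation, no special handling of interlaced material).

```c
#include <stddef.h>
#include <stdint.h>

/* Convert planar YUV 4:2:0 (separate Y, U and V planes) to packed
 * YUYV 4:2:2. Every output byte is a copy, so the whole frame is read
 * once and written once at twice the luma size, for every frame played. */
void i420_to_yuyv(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                  uint8_t *dst, int width, int height)
{
    int cw = width / 2;                       /* chroma plane width */
    for (int row = 0; row < height; row++) {
        const uint8_t *yrow = y + (size_t)row * width;
        const uint8_t *urow = u + (size_t)(row / 2) * cw;  /* shared by 2 rows */
        const uint8_t *vrow = v + (size_t)(row / 2) * cw;
        uint8_t *out = dst + (size_t)row * width * 2;
        for (int i = 0; i < cw; i++) {
            out[4 * i + 0] = yrow[2 * i];     /* Y0 */
            out[4 * i + 1] = urow[i];         /* U  */
            out[4 * i + 2] = yrow[2 * i + 1]; /* Y1 */
            out[4 * i + 3] = vrow[i];         /* V  */
        }
    }
}
```

For a 720x576 PAL frame this touches roughly 1.4 MB per frame, which is why skipping the step entirely, by letting the display consumer accept the planar format directly, is more attractive than optimizing it.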
From: Dan D. <da...@de...> - 2007-02-28 17:52:29
|
On Tuesday 27 February 2007 11:49 pm, Dan Dennedy wrote:
> with Xv. I did some quick performance comparisons on my dual AthlonXP 2000
> with an ancient Matrox G450 AGP video card. All results were based on top's
> %CPU:
>
> inigo with libdv: 47-53%
> ffplay (very recent): 40-45%
> mplayer with libdv: 40-45%
> mplayer with ffdv: 35-40%
> inigo with MainConcept DV: 34-37%
>
> You should use the avformat MLT producer for DV since it is faster. I could
> not produce numbers for inigo+avformat today because I have currently
> broken while my avformat producer while trying to add support for ffmpeg's
> new libswscale, but I would guess it to be in the 42-47% range.
>
> I wonder what is wrong with the performance on your system? Maybe MLT
> benefits a lot from SMP. I have a single processor 2.4Ghz P4 machine at
> work that has MLT with libdv and ffmpeg on it already, and I will test it
> tomorrow and compare the performance results.

I repaired the avformat producer on my system last night, and it fared the same as libdv. Some improvements can probably be made in the avformat producer. I know it opens the file separately for audio and video to fit in with the MLT architecture, and that is probably partly to blame, as it seeks within the file and reads each frame from disk twice.

I just tested on my machine at work (2.8 GHz single P4) with no hyperthreading:

    inigo with libdv:     ~35%
    inigo with avformat:  ~28%
    ffplay:               13.5%
    mplayer with libdv:   ~28%
    mplayer with ffdv:    28-30%

Interesting to note here is that my machine was designed to be a server, and it has an ATI adapter on PCI (blech!). Therefore, Xorg is using around 50%!

--
+-DRD-+
From: Dan D. <da...@de...> - 2007-02-28 18:49:17
|
On Wednesday 28 February 2007 10:01 am, Stefan de Konink wrote:
> Dan Dennedy wrote:
> > ffplay: 13.5%
> >
> :o Is that a typo?!
>
> If not, something is very improved since the last update.

$ time ffplay timecode.dv
(a simple talking head with timecode burned into it)

real    0m23.786s
user    0m2.878s
sys     0m0.220s

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping        : 1
cpu MHz         : 2800.343
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx pni monitor ds_cpl cid xtpr
bogomips        : 5603.87

--
+-DRD-+
From: Dan D. <da...@de...> - 2007-02-28 18:11:55
|
On Wednesday 28 February 2007 4:50 am, Stefan de Konink wrote:
> Dan Dennedy wrote:
> > MLT is untested on big endian and not much attention paid to endian
> > issues during development. :-/ And I expect codec performance on PPC to
> > be abysmal.
>
> I don't blame ;) libdv is poor yes, but ffmpeg has some nice altivec
> optimized parts.
>
> > "pass-through" is not fully implemented if I understand your meaning of
> > that term correctly. The libdv producer "attaches" the DV data to the
> > frame object passing through the framework, but it also always decodes
> > it. Currently, no consumer module processes the dv_data attachment for
> > file or dv1394 output. Rather, the libdv consumer always encodes the
> > uncompressed data on the frame to DV. As DV encoding is extremely
> > intensive, you should expect significant performance problems going that
> > route. The MLT sponsor, UEL, uses uncompressed SDI output.
>
> Probably most people want a fast decompression for SDL (to tv-out) or
> SDI output. But I met enough people who prefer to use an external
> firewire based solution. Things like Canopus or a Sony videomixer. Maybe
> this is a feature that could come on a 'wish' list, this is the way most
> render engines for videoediting do, only adapt what is changed, and
> pass-through the rest. In that way one could pre-render 'heavy' parts way
> before they are played. Since the 'cat' operation to play isn't a
> difficult task.
>
> > Please send the %CPU range you see using an x86 system with these
> > invocations: inigo libdv:some.dv -consumer sdl rescale=none
>
> real 0m39.332s
> user 0m9.581s
> sys 0m1.432s 28%
>
> > inigo avformat:some.dv -consumer sdl rescale=none
>
> real 0m34.271s
> user 0m5.488s
> sys 0m0.764s 18%

The real times of the two tests above are very different. When you run these tests, how are you telling inigo to exit at the end? If you have to quit manually, then it might skew the results.

> This is measured on a AMD64 X2 5600+, when I look at fast movement,
> there is still a noticeable lag.

Do you think frames are getting dropped, or is the playback slowing?

> To do some comparison:
>
> mplayer rechts.dv (Which uses the FFmpeg DV decoder, YV12 with Xv)
>
> real 0m33.284s
> user 0m3.352s
> sys 0m0.008s 10%

I lack the capacity to make MLT perform better than the other media players in the near-to-mid term.

--
+-DRD-+
|
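For anyone comparing these figures: the percentages written next to the `time` output in this thread look like (user + sys) / real. The thread never states that formula, so treat it as an assumption. A minimal Python sketch of the arithmetic follows; the helper names `parse_time_field` and `cpu_percent` are hypothetical, and the input values are copied from the messages above.

```python
import re


def parse_time_field(field: str) -> float:
    """Convert a shell `time` field such as '0m39.332s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", field.strip())
    if match is None:
        raise ValueError(f"unrecognised time field: {field!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def cpu_percent(real: str, user: str, sys: str) -> float:
    """CPU share of the process, assumed here to be (user + sys) / real."""
    real_s = parse_time_field(real)
    if real_s <= 0:
        raise ValueError("real time must be positive")
    busy_s = parse_time_field(user) + parse_time_field(sys)
    return 100.0 * busy_s / real_s


# (real, user, sys) triples as quoted in this thread.
RUNS = {
    "inigo libdv": ("0m39.332s", "0m9.581s", "0m1.432s"),
    "inigo avformat": ("0m34.271s", "0m5.488s", "0m0.764s"),
    "mplayer (ffdv)": ("0m33.284s", "0m3.352s", "0m0.008s"),
    "ffplay": ("0m23.786s", "0m2.878s", "0m0.220s"),
}

if __name__ == "__main__":
    for name, fields in RUNS.items():
        print(f"{name}: {cpu_percent(*fields):.1f}%")
```

Two caveats if the figures are derived this way. The result only counts the player's own CPU time, so load shifted to the X server does not show up. And any extra wall-clock time spent waiting for a manual quit inflates `real` and lowers the percentage, which is why the differing real times matter.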
From: Dan D. <da...@de...> - 2007-02-28 19:03:19
|
On Wednesday 28 February 2007 10:20 am, Stefan de Konink wrote:
> Dan Dennedy wrote:
> > I lack the capacity to make MLT perform better than the other media
> > players in the near-to-mid term.
>
> First, you are not alone :) Second, maybe we could find some tricks in
> the other players to speed up the play part of MLT. And another option
> is to use MLT as the 'generator' and playing the content in for example
> ffplay/mplayer.

While I am pessimistic about improving over these players, with the assistance provided by Stephane Fillod (even if just informative for now), I am more optimistic about being able to make improvements. However, before I go down this rabbit hole, I really need time to review MLT code, document it more during that process, and first look to addressing some code ugliness (hardcoding, assumptions, etc.). I also want to integrate some unit tests.

--
+-DRD-+

P.S. I updated my old kdenlive checkout last night and took it out for a test drive. It is very impressive, which is also helping to restore some enthusiasm and hope!
|
From: Zachary D. <zac...@gm...> - 2007-02-28 19:16:55
|
On 2/28/07, Dan Dennedy <da...@de...> wrote:
>
> On Wednesday 28 February 2007 10:20 am, Stefan de Konink wrote:
> > Dan Dennedy wrote:
> > > I lack the capacity to make MLT perform better than the other media
> > > players in the near-to-mid term.
> >
> > First, you are not alone :) Second, maybe we could find some tricks in
> > the other players to speed up the play part of MLT. And another option
> > is to use MLT as the 'generator' and playing the content in for example
> > ffplay/mplayer.
>
> While I am pessimistic about improving over these players, with the
> assistance provided by Stephane Fillod (even if just informative for now),
> I am more optimistic about being able to make improvements. However,
> before I go down this rabbit hole, I really need time to review MLT code,
> document it more during that process, and first look to addressing some
> code ugliness (hardcoding, assumptions, etc.). I also want to integrate
> some unit tests.

If it's not out of your way, could you look into making MLT more general purpose with regard to resolution and frame rate?

> --
> +-DRD-+
>
> P.S. I updated my old kdenlive checkout last night and took it out for a
> test drive. It is very impressive, which is also helping to restore some
> enthusiasm and hope!

Kdenlive may be enough to bring me back into the fold in the coming months too.

-Zach
|