Re: [Mlt-devel] opengl support

On Wed, Jan 4, 2012 at 11:29 AM, Christophe Thommeret <hf...@fr...> wrote:
> Le jeudi 22 décembre 2011 07:07:13, Dan Dennedy a écrit :
>> > Yes, OpenCL has some interesting features. But at the moment, my first
>> > experiments are quite disappointing.
>> > First, it's really difficult to optimize for speed. It really requires
>> > intimate knowledge of the targeted hardware to get the best from it.
>> > I've written a bicubic scaling filter for both openGL and openCL, and
>> > after several hours of reading, adjusting and testing using all
>> > available nvidia's gpu documentation (mostly found in cuda), the openCL
>> > version is still far from OpenGL performance. To upscale a 720x576
>> > image to 1920x1080, OpenCL is about 3x slower: 323 fps vs 1150 fps.
>>
>> Interesting findings, and maybe this is due to immaturity of OpenCL
>> implementations. Thank you for the update. GLSL is interesting too. I
>> believe both avenues may face inconsistent levels of support across
>> vendors and chips.
>
> Yes, OpenCL implementations are probably immature. NVIDIA's is more or
> less a CUDA wrapper, and I've discovered that CUDA can't write to textures
> (image2D in OpenCL), so I guess NVIDIA's OpenCL image writing is nothing but
> a big (and slow) hack. Using global (GPU) memory instead runs a bit faster
> (400 vs 320 fps). The good news is that OpenCL and OpenGL can share buffers
> (on GPU implementations), so OpenCL could be used for general-purpose
> algorithms while GLSL would give its best on pixels.
> Anyway, I now have to try this inside MLT. But first I have a few questions
> and suggestions.
>
> - is it possible to constrain the list of available effects (filters,
> transitions..) based on a consumer property or a factory_init option? The idea

Applications may filter the result. They can use something in the
filter name established as a convention, which will be fastest. A more
rigid way that does not pollute the names is to add a flag to the
metadata schema.
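
As a rough sketch of the naming-convention approach on the application
side: the "gl." prefix and the function names below are made up for
illustration and are not part of MLT. The app walks its list of
service names and keeps only the ones that match the consumer's mode.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical convention: GPU-backed services are named "gl.<something>". */
#define GPU_PREFIX "gl."

/* Returns 1 when the name follows the assumed GPU naming convention. */
static int is_gpu_service(const char *name)
{
    return strncmp(name, GPU_PREFIX, strlen(GPU_PREFIX)) == 0;
}

/*
 * Copies pointers to the names that match the requested mode into out
 * (which must have room for count entries) and returns how many were kept.
 * want_gpu != 0 keeps GPU services only; want_gpu == 0 keeps CPU services only.
 */
static size_t select_services(const char *const *names, size_t count,
                              int want_gpu, const char **out)
{
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (is_gpu_service(names[i]) == (want_gpu != 0))
            out[kept++] = names[i];
    }
    return kept;
}
```

A metadata flag would replace the strncmp() test with a lookup of that
flag, at the cost of loading the metadata for every service.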

> is to avoid mixing GPU and CPU effects, because even with PBOs, downloading
> data from the GPU is really slow. That would probably negate any performance
> gain. I understand that it requires a full set of GLSL effects to be
> implemented, but most would just be a port.

Right. The most critical path I see here is the "normalizing" filters
that are attached by the 'loader' producer per the configuration in
src/modules/core/loader.ini. It would be best to have a GL filter for
each line you see in that file. Then, the GL variant can be listed
first. As long as the GL filter invalidates itself in its init routine
by returning NULL, then the next filter in the list is tried.
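
The fallback can be pictured roughly like this. This is only a
self-contained illustration of "try each candidate until one does not
return NULL"; filter_t, the init functions, and the gl_available
argument are hypothetical stand-ins, not MLT's actual types or loader
code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a filter object; not MLT's mlt_filter. */
typedef struct {
    const char *name;
} filter_t;

/* An init routine returns NULL to invalidate itself. */
typedef filter_t *(*filter_init_fn)(int gl_available);

static filter_t gl_resize  = { "gl_resize" };
static filter_t cpu_resize = { "cpu_resize" };

/* GL variant: refuses to initialise when no usable GL context exists. */
static filter_t *gl_resize_init(int gl_available)
{
    return gl_available ? &gl_resize : NULL;
}

/* CPU variant: always available. */
static filter_t *cpu_resize_init(int gl_available)
{
    (void) gl_available;
    return &cpu_resize;
}

/* Tries the candidates in list order and returns the first that initialises. */
static filter_t *first_valid_filter(const filter_init_fn *candidates,
                                    size_t count, int gl_available)
{
    for (size_t i = 0; i < count; i++) {
        filter_t *filter = candidates[i](gl_available);
        if (filter != NULL)
            return filter;
    }
    return NULL; /* nothing on this line could be initialised */
}
```

With the candidates ordered { gl_resize_init, cpu_resize_init }, the GL
filter is picked when GL is usable and the CPU filter otherwise, which
is why the GL variant should come first on each line.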

> - how does MLT make use of multithreading? I guess it runs producers in
> separate threads, but is this also the case for filters? Multithreading is not
> really an option with OpenGL (binding/releasing a context is incredibly slow:
> 40 fps vs 2200), so thread synchronization must happen right before entering
> the GL path.

It is all in mlt_consumer.c and governed by the real_time property. It
is best/easiest to think of the output consumer separately and
everything else that constitutes the "pipeline" or graph as its
producer. Frame objects are put into a job queue and then one or more
worker threads process the queue. By default, there is one worker
thread. Then, the consumer runs in a separate thread.

> - should we have a gl consumer that takes care of display in a window provided
> by the frontend (similar to sdl consumer) or should the frontend take care of
> context creation and GL initialisation? The latter may have some advantages,

All GL-based applications, including Kdenlive and Shotcut, use an
audio-only consumer that fires an event when a video frame should be
shown. The app then retrieves the image from the frame and copies it to
a texture.

> for example a Qt frontend could make sure to have a context compatible with
> QtopenglPaintEngine and take advantage of QPainter to do overpainting (e.g.
> for title or image placement)

You can see my Qt GL code here (much less to look through than Kdenlive):
git://gitorious.org/mltframework/shotcut.git

It only uses GL on OS X, but you can easily modify the factory method
in mltcontroller.cpp to use the GL widget on Linux.

> - how do you retrieve native frame image data format? glsl can upload any
> uchar data and do csc in shaders, so it would be a waste of resource to have
> the cpu doing a csc first.

mlt_frame_get_image() - see Shotcut
A recent change in MLT (git head, since last release) is that you can
use the mlt_image_none as the image format to get an image as close to
the source colorspace as possible (but also in a format supported by
MLT). Also, if you use the 'abnormal' producer to load the resource,
then it will not attach the normalizing filters. (However, it still
uses the other features of the 'loader' producer to use loader.dict
filename extension-matching to choose a suitable producer and attach
the image and audio format conversion filters.)

> - is it possible to disable an effect based on runtime checks? e.g. if an
> effect requires GLSL 4.2 and the running platform has GLSL 1.5, it should be
> disabled.

Yes, in the service's init routine return NULL.
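
For the GLSL example, the check inside the init routine could look
something like the sketch below. The helper names are hypothetical,
and it assumes the version string (as you would get from
glGetString(GL_SHADING_LANGUAGE_VERSION)) starts with "major.minor"
using a two-digit minor, e.g. "4.20" or "1.50". When glsl_supports()
returns 0, the init routine would return NULL.

```c
#include <stdio.h>

/*
 * Parses a version string such as "4.20 NVIDIA via Cg compiler" into 420.
 * Returns -1 when the string is NULL or does not start with "major.minor".
 * Assumption: the minor part is reported with two digits ("1.50", not "1.5").
 */
static int parse_glsl_version(const char *version_string)
{
    int major = 0;
    int minor = 0;
    if (version_string == NULL ||
        sscanf(version_string, "%d.%d", &major, &minor) != 2)
        return -1;
    return major * 100 + minor;
}

/* Returns 1 when the reported version is at least `required` (e.g. 420). */
static int glsl_supports(const char *version_string, int required)
{
    int version = parse_glsl_version(version_string);
    return version >= 0 && version >= required;
}
```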

> P.S.
> Since I'm still not sure I understand MLT's framework well, some of the above
> may make no sense :)
>
>
> --
> Christophe Thommeret