Thread: [Mlt-devel] opengl support
From: Christophe T. <hf...@fr...> - 2011-11-14 20:24:02
Hi,

First, thanks for this great work. I'm using MLT with the Kdenlive frontend to edit video (mainly AVCHD) and it does its job quite well. But even with low-res video (I use Kdenlive's "proxy clip" feature), adding a simple effect like contrast makes the preview drop frames. I surely don't have the strongest CPU around, but I think an E8500 is not so bad.

I've recently played a bit with OpenGL and found that a simple NVIDIA GT220 can run several GLSL filters (say 6, like contrast, brightness...) on 1920x1080 while still displaying 50 fps (deinterlaced 25i)! So you can imagine what could be done on low res! Having a flawless preview is really comfortable :)

I see that OpenGL is in MLT's todo, so I wonder if you already have some thoughts on what should be done and how it should be done. I would be happy to help.

--
Christophe Thommeret
From: Dan D. <da...@de...> - 2011-11-14 21:49:17
On Mon, Nov 14, 2011 at 12:23 PM, Christophe Thommeret <hf...@fr...> wrote:
> [...]
> I see that opengl is in MLT's todo. So i wonder if you have already some
> thoughts on what should be done and how it should be done.
> I would be happy to help.

There is already a WebVfx plugin that supports GLSL via JS in QtWebKit's WebGL. Recently, I added a shader for YCbCr-to-RGB color conversion for the OpenGL output in a new project I am working on, and I see its benefit.

In general, it seems difficult and costly performance-wise to fit GPU effects into the framework unless you use them entirely. Otherwise, you end up having to do some on the CPU in order to keep the full flexibility and expression of arbitrary filter ordering. Also, one should consider OpenCL and how either can be combined with hwaccel decoder output in the vendor-independent VA-API, which is still not integrated. I am not willing to lend much effort to an NVIDIA-only solution. Also, my experience with VDPAU shows it gets tricky to keep things stable when you combine it with multiple video tracks, transitions, and parallel processing, which is why it is not enabled by default.

I am slowly learning more about where I want to go with this, but not aggressively. Feel free to contribute. You might want to contribute VA-API integration or improve upon VDPAU to get started. Once you want to move into filters, we will need an image type to represent a surface or PBO, and its converters with the other image types.

--
+-DRD-+
From: Christophe T. <hf...@fr...> - 2011-11-18 21:20:50
Le lundi 14 novembre 2011 22:49:09, Dan Dennedy a écrit :
> Also, my experience with VDPAU shows it gets tricky to keep
> things stable when you combine it with multiple video tracks,
> transitions, and parallel processing, which is why it is not enabled
> by default.

Hi Dan,

I've looked at MLT's VDPAU code and I see the problem. A VDP decoder maintains a bunch of internal variables that are not exposed to the user through video surfaces, so you can't destroy and recreate it without issues. I've fixed this by removing the global g_vdpau and adding vdpDevice and vdpDecoder to the producer_avformat.vdpau struct. Transitions look good now. Do you have any other reproducible bug in mind?

--
Christophe Thommeret
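The shape of that fix can be sketched as follows. The field names (vdpDevice, vdpDecoder) follow the ones mentioned above, but the struct layout and helper functions are illustrative stand-ins, not MLT's actual code:

```c
#include <stdlib.h>

/* Illustrative stand-ins for the VDPAU handle types. */
typedef int VdpDevice;
typedef int VdpDecoder;

/* Before the fix: one global g_vdpau decoder shared by every producer,
 * so destroying/recreating it from one clip corrupted the others.
 * After: each producer owns its own device and decoder, so a
 * transition overlapping two clips uses two independent decoders. */
struct producer_avformat_vdpau
{
    VdpDevice  vdpDevice;
    VdpDecoder vdpDecoder;
};

struct producer_avformat
{
    struct producer_avformat_vdpau *vdpau;
    /* ... other producer state elided ... */
};

static int next_handle = 1; /* fake handle allocator for the sketch */

struct producer_avformat_vdpau *vdpau_init( struct producer_avformat *self )
{
    self->vdpau = malloc( sizeof( *self->vdpau ) );
    self->vdpau->vdpDevice = next_handle++;
    self->vdpau->vdpDecoder = next_handle++;
    return self->vdpau;
}

void vdpau_fini( struct producer_avformat *self )
{
    /* Only this producer's decoder is torn down; others are untouched. */
    free( self->vdpau );
    self->vdpau = NULL;
}
```

With per-producer state, tearing down one producer's decoder cannot invalidate another's, which is the failure mode described for multi-track transitions.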
From: Christophe T. <hf...@fr...> - 2011-11-19 11:02:44
Attachments:
vdpau.diff
Le vendredi 18 novembre 2011 22:20:05, Christophe Thommeret a écrit :
> I've fixed this by removing the global g_vdpau and adding vdpDevice and
> vdpDecoder to producer_avformat.vdpau struct. Transitions look good now. Do
> you have any other reproducible bug in mind ?

Patch attached. Seems stable here on a VDPAU feature set C GPU (which is probably the most exposed to the problem).

--
Christophe Thommeret
From: Dan D. <da...@de...> - 2011-11-23 04:11:21
On Sat, Nov 19, 2011 at 3:01 AM, Christophe Thommeret <hf...@fr...> wrote:
> Le vendredi 18 novembre 2011 22:20:05, Christophe Thommeret a écrit :
>> [...] Transitions look good now. Do
>> you have any other reproducible bug in mind ?

You might want to do some testing with real_time=2 or =3 (the "Processing threads" setting in Kdenlive) if you have multiple cores. Other than that, I have no specific bugs to address. Do you know much about VA-API and how to fetch the decoded images from video memory using it?

> Patch attached.
> Seems stable here on vdpau feature set C gpu (Which are probably the most
> exposed to the problem.)

Thank you! Would you please try to update your patch against trunk? There was one small change in vdpau.c on Nov 3 that broke a big hunk of the patch on vdpau.c.

--
+-DRD-+
From: Christophe T. <hf...@fr...> - 2011-11-23 21:19:36
Attachments:
vdpau.diff
Le mercredi 23 novembre 2011 05:11:15, Dan Dennedy a écrit :
> You might want to do some testing with real_time=2 or =3 (Processing
> threads setting in Kdenlive) if you have multiple cores. Other than
> that, I have no specific bugs to address. Do you know much about
> VA-API and how to fetch the decoded images from video memory using
> it?

I don't know VA-API, but a quick look at xine's VA-API support (which uses FFmpeg's VA-API) suggests something like:

vaCreateImage
vaGetImage
vaMapBuffer
vaUnmapBuffer
vaDestroyImage

You can find it at https://github.com/huceke/xine-lib-vaapi/blob/vaapi/src/video_out/video_out_vaapi.c in the function provide_standard_frame_data. You can compare with the corresponding VDPAU function at https://github.com/huceke/xine-lib-vaapi/blob/vaapi/src/video_out/video_out_vdpau.c

> Thank you! Would you please try to update your patch against trunk?
> There was one small change in vdpau.c Nov 3 that broke a big hunk of
> the patch on vdpau.c.

Oh, sorry. Find attached the (hopefully) good one.

--
Christophe Thommeret
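The real libva calls take a VADisplay plus surface/image handles and return a VAStatus; to make the call order above checkable without a GPU, this compilable sketch replaces each with a stub that only records that it ran. The stub names and the fetch function are illustrative:

```c
#include <string.h>

/* Sketch of the readback sequence named above:
 * create -> get -> map -> (copy rows) -> unmap -> destroy.
 * Each stub appends its name to a log so the order can be asserted. */

static char call_log[128];

static void log_call( const char *name )
{
    if ( call_log[0] )
        strcat( call_log, "," );
    strcat( call_log, name );
}

static void vaCreateImage_stub( void )  { log_call( "vaCreateImage" ); }
static void vaGetImage_stub( void )     { log_call( "vaGetImage" ); }
static void vaMapBuffer_stub( void )    { log_call( "vaMapBuffer" ); }
static void vaUnmapBuffer_stub( void )  { log_call( "vaUnmapBuffer" ); }
static void vaDestroyImage_stub( void ) { log_call( "vaDestroyImage" ); }

/* Fetch one decoded frame from video memory into system memory. */
const char *fetch_decoded_image( void )
{
    call_log[0] = '\0';
    vaCreateImage_stub();   /* allocate a VAImage in a CPU-readable format */
    vaGetImage_stub();      /* download the decoded surface into it */
    vaMapBuffer_stub();     /* map the image buffer; memcpy the planes here */
    vaUnmapBuffer_stub();
    vaDestroyImage_stub();
    return call_log;
}
```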
From: Dan D. <da...@de...> - 2011-11-27 21:00:24
On Wed, Nov 23, 2011 at 1:18 PM, Christophe Thommeret <hf...@fr...> wrote:
> Oh, sorry.
> Find attached the (hopefully) good one.

Yes, basic testing worked fine here. I applied the patch and pushed it.

--
+-DRD-+
From: Dan D. <da...@de...> - 2011-11-29 19:19:27
Christophe, if you are interested in continuing to work in this area, perhaps consider making a GEGL plugin for MLT, as GEGL is continuing to build on its usage of OpenCL:
http://libregraphicsworld.org/blog/entry/gegl-developer-sponsored-to-improve-hardware-acceleration-support

On Mon, Nov 14, 2011 at 1:49 PM, Dan Dennedy <da...@de...> wrote:
> [...]

--
+-DRD-+
From: Christophe T. <hf...@fr...> - 2011-11-30 19:30:47
Le mardi 29 novembre 2011 20:19:16, Dan Dennedy a écrit :
> Christophe, if you are interested in continuing to work in this area,
> perhaps consider to make a GEGL plugin for MLT as it is continuing to
> build on its usage of OpenCL:
> http://libregraphicsworld.org/blog/entry/gegl-developer-sponsored-to-improve-hardware-acceleration-support

Hi Dan,

I have a lot of video shots stored, waiting to be edited. So yes, I'm still interested :)

My first task is to learn OpenCL. Unsurprisingly, the spec looks quite similar to OpenGL, so I expect the same benefits and the same caveat: non-unified memory. So even with this solution, we would still have to stay as much as possible on the same rendering path (until unified memory comes to real life). As soon as possible, I will run some OpenCL benchmarks to compare against OpenGL.

After that, I will look deeper into GEGL, but a first look makes me think that its tile-based design does not map very well to OpenCL. And I don't think we are interested in 16/32-bit in the video world, where you finally render to yuv420 :)

> [...]

--
Christophe Thommeret
From: Christophe T. <hf...@fr...> - 2011-12-20 19:59:36
Le mercredi 30 novembre 2011 20:28:55, Christophe Thommeret a écrit :
> My first task is to learn openCL. [...] As soon as possible, I will make
> some openCL bench to compare to openGL.

Hi Dan,

Yes, OpenCL has some interesting features. But at the moment, my first experiments are quite disappointing. First, it's really difficult to optimize for speed; it requires intimate knowledge of the targeted hardware to get the best from it. I've written a bicubic scaling filter for both OpenGL and OpenCL, and after several hours of reading, adjusting and testing using all available NVIDIA GPU documentation (mostly found in CUDA), the OpenCL version is still far from OpenGL performance. To upscale a 720x576 image to 1920x1080, OpenCL is about 3x slower: 323 fps vs 1150.

--
Christophe Thommeret
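For reference, the arithmetic a bicubic scaler evaluates per output sample looks like the following. The email does not say which cubic kernel was used; this sketch assumes Catmull-Rom (a = -0.5), a common choice, and shows the CPU form of what the GLSL/OpenCL versions would compute per pixel:

```c
#include <math.h>

/* Catmull-Rom cubic kernel (a = -0.5). Interpolating: weight 1 at
 * t = 0, weight 0 at the other integer sample offsets. */
double bicubic_weight( double t )
{
    const double a = -0.5;
    t = fabs( t );
    if ( t <= 1.0 )
        return ( a + 2.0 ) * t * t * t - ( a + 3.0 ) * t * t + 1.0;
    if ( t < 2.0 )
        return a * t * t * t - 5.0 * a * t * t + 8.0 * a * t - 4.0 * a;
    return 0.0;
}

/* One 1-D interpolation from 4 neighbouring samples p[0..3], where
 * x in [0,1) is the fractional position between p[1] and p[2].
 * A full 2-D bicubic scaler applies this across rows, then columns. */
double bicubic_sample( const double p[4], double x )
{
    return p[0] * bicubic_weight( x + 1.0 )
         + p[1] * bicubic_weight( x )
         + p[2] * bicubic_weight( 1.0 - x )
         + p[3] * bicubic_weight( 2.0 - x );
}
```

Upscaling 720x576 to 1920x1080 means roughly two million such 4x4-tap evaluations per frame, which is why per-tap texture-read cost dominates and the OpenGL/OpenCL gap shows up so clearly.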
From: Christophe T. <hf...@fr...> - 2011-12-20 20:01:14
Le mardi 20 décembre 2011 20:59:36, Christophe Thommeret a écrit :
> [...] To upscale a 720x576 image to 1920x1080, openCL is about 3x slower :
> 323 fps vs 1150.

(Sent a bit too fast :) Still working on this, though...

--
Christophe Thommeret
From: Dan D. <da...@de...> - 2011-12-22 06:07:19
On Tue, Dec 20, 2011 at 11:59 AM, Christophe Thommeret <hf...@fr...> wrote:
> I've written a bicubic scaling filter for both openGL and openCL, and after
> several hours of reading, adjusting and testing using all available nvidia's
> gpu documentation (mostly found in cuda), the openCL version is still far from
> openGL performances. To upscale a 720x576 image to 1920x1080, openCL is about
> 3x slower : 323 fps vs 1150.

Interesting findings; maybe this is due to the immaturity of OpenCL implementations. Thank you for the update. GLSL is interesting too. I believe both avenues may face inconsistent levels of support across vendors and chips.

--
+-DRD-+
From: Christophe T. <hf...@fr...> - 2012-01-04 19:29:40
Le jeudi 22 décembre 2011 07:07:13, Dan Dennedy a écrit :
> Interesting findings, and maybe this is due to immaturity of OpenCL
> implementations. Thank you for the update. GLSL is interesting too. I
> believe both avenues may face inconsistent levels of support across
> vendors and chips.

Yes, OpenCL implementations are probably immature. NVIDIA's is more or less a CUDA wrapper, and I've discovered that CUDA can't write to textures (image2D in OpenCL), so I guess NV's OpenCL image writing is nothing but a big (and slow) hack. Using global (GPU) memory instead runs a bit faster (400 vs 320 fps). The good news is that OpenCL and OpenGL can share buffers (on GPU implementations), so OpenCL could be used for general-purpose algorithms while GLSL would give its best on pixels.

Anyway, I now have to try this inside MLT. But first I have a few questions and suggestions.

- Is it possible to constrain the list of available effects (filters, transitions...) based on a consumer property or a factory_init option? The idea is to avoid mixing GPU and CPU effects, because even with PBOs, downloading data from the GPU is really slow. That would probably void all performance gains. I understand that it requires a full set of GLSL effects to be implemented, but most would just be a port.

- How does MLT make use of multithreading? I guess it runs producers in separate threads, but is this also the case for filters? Multithreading is not really an option with OpenGL (binding/releasing a context is incredibly slow: 40 fps vs 2200), so thread synchronization must happen right before entering the GL path.

- Should we have a GL consumer that takes care of display in a window provided by the frontend (similar to the SDL consumer), or should the frontend take care of context creation and GL initialisation? The latter may have some advantages; for example, a Qt frontend could make sure to have a context compatible with QtOpenGLPaintEngine and take advantage of QPainter to do overpainting (e.g. for title or image placement).

- How do you retrieve the native frame image data format? GLSL can upload any uchar data and do CSC in shaders, so it would be a waste of resources to have the CPU do a CSC first.

- Is it possible to disable an effect based on runtime checks? E.g. if an effect requires GLSL 4.2 and the running platform has GLSL 1.5, it should be disabled.

P.S. Since I'm still unsure I understand MLT's framework well, some of the above may make no sense :)

--
Christophe Thommeret
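The synchronization constraint raised above (one GL context, serialize right before the GL path rather than re-binding the context per thread) can be sketched with POSIX threads. The function names and the counter standing in for GL state are illustrative:

```c
#include <pthread.h>

/* Keep ONE GL context bound and make worker threads serialize on a
 * mutex right before the GL stage, instead of binding/releasing the
 * context per frame (measured above at ~40 fps vs ~2200).
 * "frames_rendered" stands in for real context-owned GL state. */

static pthread_mutex_t gl_lock = PTHREAD_MUTEX_INITIALIZER;
static long frames_rendered = 0;

/* CPU-side work (decode, deinterlace, ...) may run concurrently... */
static void cpu_stage( void ) { /* ... */ }

/* ...but the GL stage runs under the lock, one thread at a time. */
static void gl_stage( void )
{
    pthread_mutex_lock( &gl_lock );
    frames_rendered++;   /* touch context-owned state safely */
    pthread_mutex_unlock( &gl_lock );
}

static void *worker( void *arg )
{
    int i, frames = *(int *) arg;
    for ( i = 0; i < frames; i++ )
    {
        cpu_stage();
        gl_stage();
    }
    return NULL;
}

long render_parallel( int n_threads, int frames_each )
{
    pthread_t t[16];
    int i;
    frames_rendered = 0;
    for ( i = 0; i < n_threads && i < 16; i++ )
        pthread_create( &t[i], NULL, worker, &frames_each );
    for ( i = 0; i < n_threads && i < 16; i++ )
        pthread_join( t[i], NULL );
    return frames_rendered;
}
```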
From: Dan D. <da...@de...> - 2012-01-04 22:00:49
On Wed, Jan 4, 2012 at 11:29 AM, Christophe Thommeret <hf...@fr...> wrote:
> - is it possible to constrain the list of available effects (filters,
> transitions..) based on a consumer property or a factory_init option?

Applications may filter the result. They can use something in the filter name established as a convention, which will be fastest. A more rigid way that does not pollute the names is to add a flag to the metadata schema.

> The idea is to avoid mixing gpu and cpu effects, because even with PBO,
> downloading data from gpu is really slow. [...]

Right. The most critical path I see here is the "normalizing" filters that are attached by the 'loader' producer per the configuration in src/modules/core/loader.ini. It would be best to have a filter for each line you see in that file. Then, the GL variant can be listed first. As long as the GL filter invalidates itself in its init routine by returning NULL, the next filter in the list is tried.

> - how does MLT make use of multithreading?

It is all in mlt_consumer.c and governed by the real_time property. It is best/easiest to think of the output consumer separately and everything else that constitutes the "pipeline" or graph as its producer. Frame objects are put into a job queue, and then one or more worker threads process the queue. By default, there is one worker thread. Then, the consumer runs in a separate thread.

> - should we have a gl consumer that takes care of display in a window provided
> by the frontend (similar to sdl consumer) or should the frontend take care of
> context creation and gl initialisation?

All GL-based applications, including Kdenlive and Shotcut, use an audio-only consumer that fires an event when a video frame should be shown. So, the app retrieves the image from the frame and copies it to a texture.

You can see my Qt GL code here (much less to look through than Kdenlive): git://gitorious.org/mltframework/shotcut.git
It only uses GL on OS X, but you can easily modify the factory method in mltcontroller.cpp to use the GL widget on Linux.

> - how do you retrieve native frame image data format?

mlt_frame_get_image() - see Shotcut. A recent change in MLT (git head, since the last release) is that you can use mlt_image_none as the image format to get an image as close to the source colorspace as possible (but still in a format supported by MLT). Also, if you use the 'abnormal' producer to load the resource, then it will not attach the normalizing filters. (However, it still uses the other features of the 'loader' producer: loader.dict filename extension-matching to choose a suitable producer, and attaching the image and audio format conversion filters.)

> - is it possible to disable an effect based on runtime checks?

Yes, in the service's init routine return NULL.

--
+-DRD-+
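The fallback behaviour described above (GL variant listed first; a factory that returns NULL invalidates itself and the next entry is tried) can be sketched like this. The filter type, factory names, and the capability flag are illustrative, not MLT's API:

```c
#include <stddef.h>

typedef struct { const char *name; } filter_t;

static int glsl_supported = 0; /* pretend runtime capability check */

static filter_t glsl_csc = { "glsl.csc" };
static filter_t avcolor_space = { "avcolor_space" };

/* GL variant: invalidates itself by returning NULL when GL is absent. */
static filter_t *create_glsl_csc( void )
{
    return glsl_supported ? &glsl_csc : NULL;
}

/* CPU fallback: always available. */
static filter_t *create_avcolor_space( void )
{
    return &avcolor_space;
}

/* Try each factory in listed order, as the loader does for each
 * line of loader.ini; the first non-NULL result wins. */
filter_t *create_normalizer( void )
{
    filter_t *(*factories[])( void ) = { create_glsl_csc, create_avcolor_space };
    size_t i;
    for ( i = 0; i < sizeof( factories ) / sizeof( factories[0] ); i++ )
    {
        filter_t *f = factories[i]();
        if ( f )
            return f;
    }
    return NULL;
}
```

This keeps the decision at service-construction time, so a GLSL 4.2 filter on a GLSL 1.5 platform simply never appears in the graph.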
From: Christophe T. <hf...@fr...> - 2012-02-09 21:03:24
Hi Dan,

I can now run "melt -consumer qgl clip.xxx". In this case, glsl.csc is created instead of avcolor_space. It converts rgb24/rgb24a/yuv420p/yuv422 to a GL texture. It calls

mlt_frame_set_image( frame, (uint8_t*)dest, sizeof(struct glsl_texture_s), g->texture_destructor );

and sets "format" to mlt_image_glsl. Then the qgl consumer (a QGLWidget) displays dest->texture.

This works well for yuv422 and rgb24, but it never gets yuv420p: all yuv420 streams are converted somewhere (I guess in the producer) to yuv422 before reaching the CSC filter. Is there any way to disable this initial conversion? Not only could I validate the yuv420p_to_glsl code, but it would also minimize a bit the amount of data to upload. (Note: normalisers are disabled.)

--
Christophe Thommeret
From: Dan D. <da...@de...> - 2012-02-09 22:34:21
On Thu, Feb 9, 2012 at 1:03 PM, Christophe Thommeret <hf...@fr...> wrote:
> In this case, glsl.csc is created instead of avcolor_space.
> [...]
> and sets "format" to mlt_image_glsl.

Why not the existing mlt_image_opengl? There are some legacy cases of code treating mlt_image_opengl the same as mlt_image_rgb24a, but we can clean that up or switch to it later.

> then the qgl consumer (a QGLWidget) displays dest->texture.

I think something like a QGLWidget really belongs more in a Qt app than in MLT. Also, that way the app has more flexibility in choosing an audio-only consumer; but if it makes your development easier by simply working with melt, I understand.

> This works well for yuv422 and rgb24, but it never gets yuv420p, all yuv420

If you have an image conversion filter properly applied (see core/producer_loader.c), then you can get yuv420p by using mlt_frame_get_image(), but not by directly accessing the properties. However, you seem to be mucking around with the image converters and may have broken something. How did you make "glsl.csc created instead of avcolor_space"? If you are replacing an image converter (avcolor_space is an image converter), then it is your responsibility to convert from whatever to yuv420.

> streams are converted somewhere (i guess the producer) to yuv422 before
> reaching the csc filter. Is there any way to disable this initial conversion ?

The producer is not required to give you what you ask for. For example, image-based producers generally always give rgb24 or rgb24a. Hence, the image converter is there to convert it for the mlt_frame_get_image() caller. However, the avformat producer does generally give you what you ask for.

Note also that with consumer real_time != 0 there is a thread running to pre-render frames, and that thread defaults to mlt_image_yuv422; but you can change that by setting the property mlt_image_format=yuv420p on the consumer (added since the last release).

--
+-DRD-+
From: Christophe T. <hf...@fr...> - 2012-02-09 23:12:20
Le jeudi 9 février 2012 23:34:14, Dan Dennedy a écrit :
> why not the existing mlt_image_opengl? There are some legacy cases of
> code doing something with mlt_image_opengl treating it the same as
> mlt_image_rgb24a, but we can clean that up or switch to it later.

I wasn't sure what mlt_image_opengl was for, so I went for _glsl, also thinking that if MLT later gets OpenCL support, the names might be confusing. But it's not a big deal to change it to something else.

> I think something like QGLWidget really belongs more in a Qt app than
> MLT. [...]

Exactly. I need an easy way to test, so writing a consumer (very basic, no audio, fixed ~25 fps) was much easier than porting something like Kdenlive :)

> How did you do "glsl.csc is created instead of avcolor_space?"

I've modified producer_loader_init to create glsl.csc when the consumer sets the global property data "glsl_env": a pointer to an instance of a class (well, a struct) that contains everything needed by filters; creating/caching FBOs, textures and shaders, locking the GL context (a context can only be made current in one thread at any time), etc.

> The producer is not required to give you what you ask for. [...]
> However, the avformat producer does generally give you what you ask for.

The video thread calls:

mlt_image_format vfmt = mlt_image_glsl;
int width = 0, height = 0;
uint8_t *image;
mlt_frame_get_image( frame, &image, &vfmt, &width, &height, 0 );

and the converter gets yuv422.

> Note also that with consumer.real_time <> 0 there is a thread running
> to pre-render frames, and that thread defaults to mlt_image_yuv422,
> but you can change it by setting property mlt_image_format=yuv420p on
> the consumer (added since last release).

I'm running with the default real_time=1. Could you tell me more about setting this consumer format property?

--
Christophe Thommeret
From: Dan D. <da...@de...> - 2012-02-10 03:50:37
|
On Thu, Feb 9, 2012 at 3:12 PM, Christophe Thommeret <hf...@fr...> wrote:
> On Thursday, 9 February 2012 at 23:34, Dan Dennedy wrote:
> > Note also that with consumer.real_time <> 0 there is a thread running
> > to pre-render frames, and that thread defaults to mlt_image_yuv422,
> > but you can change it by setting property mlt_image_format=yuv420p on
> > the consumer (added since last release).
>
> I'm running with the default real_time=1.
> Could you tell me more about setting this consumer format property?

When real_time != 0, mlt_consumer.c spawns abs(real_time) threads that
loop fetching a frame and calling mlt_frame_get_image(), and that call
has a format parameter. It gets the format from the consumer property
named "mlt_image_format", which accepts the names: yuv420p, yuv422,
rgb24, rgb24a, or none.

In December, I added support for mlt_image_none for a customer. This
asks mlt_frame_get_image() to return the format that is closest to its
native format. But the caller can really only expect that when there
are no filters, including normalizing filters. Maybe that is what you
want if you want to offload as much conversion as possible to GLSL. Or,
maybe you want yuv420p to reduce the bus bandwidth.

P.S. A way to request no normalizing filters at runtime is to use the
abnormal producer; the easiest way to invoke that is to prefix the
resource name with "abnormal:"

-- 
+-DRD-+
|
From: Christophe T. <hf...@fr...> - 2012-02-10 14:51:10
|
On Friday, 10 February 2012 at 04:50, Dan Dennedy wrote:
> On Thu, Feb 9, 2012 at 3:12 PM, Christophe Thommeret <hf...@fr...> wrote:
> > On Thursday, 9 February 2012 at 23:34, Dan Dennedy wrote:
> > > Note also that with consumer.real_time <> 0 there is a thread
> > > running to pre-render frames, and that thread defaults to
> > > mlt_image_yuv422, but you can change it by setting property
> > > mlt_image_format=yuv420p on the consumer (added since last
> > > release).
> >
> > I'm running with the default real_time=1.
> > Could you tell me more about setting this consumer format property?
>
> When real_time != 0, mlt_consumer.c spawns abs(real_time) threads that
> loop fetching a frame and calling mlt_frame_get_image(), and that call
> has a format parameter. It gets the format from the consumer property
> named "mlt_image_format", which accepts the names: yuv420p, yuv422,
> rgb24, rgb24a, or none.
>
> In December, I added support for mlt_image_none for a customer. This
> asks mlt_frame_get_image() to return the format that is closest to its
> native format. But the caller can really only expect that when there
> are no filters, including normalizing filters.

OK, setting this property to "yuv420p" or "none", glsl.csc gets yuv420p.
But whatever I set, when I add a filter (e.g. "melt -consumer qgl
-filter glsl.greyscale /path/to/file"), melt crashes. I found that it
happens in producer_avformat, where allocate_buffer gets mlt_image_glsl.

> Maybe that is what you want if you want to offload as much conversion
> as possible to GLSL. Or, maybe you want yuv420p to reduce the bus
> bandwidth.

At the moment, all I want is to understand how it works. yuv420p
reduces CPU usage by ~10%, but if it's not the native format, the
conversion made by producer_avformat may take more.

> P.S. A way to request no normalizing filters at runtime is to use the
> abnormal producer, and the easiest way to invoke that is to prefix the
> resource name with "abnormal:"

If you mean

melt -consumer qgl abnormal:/path/to/file

then it does not work; attach_normalisers is still called.
(So, I disable attach_normalisers in producer_loader.)

-- 
Christophe Thommeret
|
From: Dan D. <da...@de...> - 2012-02-11 18:52:46
|
On Fri, Feb 10, 2012 at 6:51 AM, Christophe Thommeret <hf...@fr...> wrote:
> > P.S. A way to request no normalizing filters at runtime is to use the
> > abnormal producer, and the easiest way to invoke that is to prefix
> > the resource name with "abnormal:"
>
> If you mean
> "melt -consumer qgl abnormal:/path/to/file"
> then it does not work; attach_normalisers is called.
> (So, I disable attach_normalisers in producer_loader.)

Yeah, I just realized that "abnormal:" did not work through melt, only
via the API. I just fixed that.

-- 
+-DRD-+
|
From: Christophe T. <hf...@fr...> - 2012-02-11 10:30:18
|
On Friday, 10 February 2012 at 15:51, Christophe Thommeret wrote:
> On Friday, 10 February 2012 at 04:50, Dan Dennedy wrote:
> > When real_time != 0, mlt_consumer.c spawns abs(real_time) threads
> > that loop fetching a frame and calling mlt_frame_get_image(), and
> > that call has a format parameter. It gets the format from the
> > consumer property named "mlt_image_format", which accepts the names:
> > yuv420p, yuv422, rgb24, rgb24a, or none.
> >
> > In December, I added support for mlt_image_none for a customer. This
> > asks mlt_frame_get_image() to return the format that is closest to
> > its native format. But the caller can really only expect that when
> > there are no filters, including normalizing filters.
>
> OK, setting this property to "yuv420p" or "none", glsl.csc gets
> yuv420p. But whatever I set, when I add a filter (e.g. "melt -consumer
> qgl -filter glsl.greyscale /path/to/file"), melt crashes. I found that
> it happens in producer_avformat, where allocate_buffer gets
> mlt_image_glsl.

Actually, I fixed this by adding the following in allocate_buffer:

if ( *format == mlt_image_glsl )
    *format = pick_format( codec_context->pix_fmt );

That may not be the correct fix, but at least it works.
Are there other producers that do internal csc?

-- 
Christophe Thommeret
|
From: Dan D. <da...@de...> - 2012-02-11 18:16:44
|
On Sat, Feb 11, 2012 at 2:30 AM, Christophe Thommeret <hf...@fr...> wrote:
> On Friday, 10 February 2012 at 15:51, Christophe Thommeret wrote:
> > OK, setting this property to "yuv420p" or "none", glsl.csc gets
> > yuv420p. But whatever I set, when I add a filter (e.g. "melt
> > -consumer qgl -filter glsl.greyscale /path/to/file"), melt crashes.
> > I found that it happens in producer_avformat, where allocate_buffer
> > gets mlt_image_glsl.
>
> Actually, I fixed this by adding the following in allocate_buffer:
>
> if ( *format == mlt_image_glsl )
>     *format = pick_format( codec_context->pix_fmt );
>
> That may not be the correct fix, but at least it works.
> Are there other producers that do internal csc?

No.

-- 
+-DRD-+
|
From: Christophe T. <hf...@fr...> - 2012-02-27 17:19:21
|
Just to let you know that I'm progressing:

melt -profile atsc_1080p_50 -consumer qgl \
  -filter glsl.saturation:1,1 -filter glsl.gamma:1,1 \
  -filter glsl.brightness:0,01 -filter glsl.contrast:1,2 \
  /home/cris/Videos/h264/clip1-720p50.mkv in=500 out=1899 \
  -track -blank 49 \
  -filter glsl.saturation:1,1 -filter glsl.gamma:1,1 \
  -filter glsl.brightness:0,01 -filter glsl.contrast:1,2 \
  /home/cris/Videos/h264/clip2-720p50.mkv \
  -transition glsl.luma in=49 out=1399 a_track=0 b_track=1

-- 
Christophe Thommeret
|
From: Dan D. <da...@de...> - 2012-03-01 18:00:16
|
On Mon, Feb 27, 2012 at 9:19 AM, Christophe Thommeret <hf...@fr...> wrote:
> Just to let you know that I'm progressing:
>
> melt -profile atsc_1080p_50 -consumer qgl -filter glsl.saturation:1,1
> -filter glsl.gamma:1,1 -filter glsl.brightness:0,01 -filter
> glsl.contrast:1,2 /home/cris/Videos/h264/clip1-720p50.mkv in=500
> out=1899 -track -blank 49 -filter glsl.saturation:1,1 -filter
> glsl.gamma:1,1 -filter glsl.brightness:0,01 -filter glsl.contrast:1,2
> /home/cris/Videos/h264/clip2-720p50.mkv -transition glsl.luma in=49
> out=1399 a_track=0 b_track=1

Congratulations on your progress! I encourage you to focus on getting
it in shape for merging and on exposing a tree for review instead of
building out filters. The more you build separately, the longer it will
take to review.

Also, something to keep in mind: there are the MLT WebVfx plugins,
which support using QML for MLT plugins, and this new post from Qt:
http://labs.qt.nokia.com/2012/02/29/pimp-my-video-shader-effects-and-multimedia/

-- 
+-DRD-+
|
From: Christophe T. <hf...@fr...> - 2012-03-05 12:56:18
|
On Thursday, 1 March 2012 at 19:00, Dan Dennedy wrote:
> On Mon, Feb 27, 2012 at 9:19 AM, Christophe Thommeret <hf...@fr...> wrote:
> > Just to let you know that I'm progressing:
> >
> > melt -profile atsc_1080p_50 -consumer qgl -filter glsl.saturation:1,1
> > -filter glsl.gamma:1,1 -filter glsl.brightness:0,01 -filter
> > glsl.contrast:1,2 /home/cris/Videos/h264/clip1-720p50.mkv in=500
> > out=1899 -track -blank 49 -filter glsl.saturation:1,1 -filter
> > glsl.gamma:1,1 -filter glsl.brightness:0,01 -filter glsl.contrast:1,2
> > /home/cris/Videos/h264/clip2-720p50.mkv -transition glsl.luma in=49
> > out=1399 a_track=0 b_track=1
>
> Congratulations on your progress! I encourage you to focus on getting
> it in shape for merging and exposing a tree for review

Sure. It's not yet ready for review, but you can already try to clone
http://hftom.homelinux.org/mlt.git (I hope it works).
I will soon write a README to explain some implementation details.

-- 
Christophe Thommeret
|