From: Matt L. <mat...@gm...> - 2005-12-23 17:28:00
Dear All,

I would like to propose a new video library to ultimately replace vidl. The current vidl library has been crudely ported multiple times (once by me), and its aging design is starting to show. Here are a few of the troubles with the current vidl library:

1) It is messy (from the crude porting)
2) The API does not support setting encoding options other than file format.
3) It treats a video as an object. This requires you to have the entire video in memory to save it to disk.
4) Few codecs are actually implemented, and some (such as MPEG) are very buggy.
5) It does not support live video streams for video capture.

It seems that many people have ignored this library in favor of writing new code to meet their needs. Some examples of this are:

  contrib/mul/mvl2  - AVI file I/O on both Windows and Linux
  contrib/brl/vvid  - video capture in Windows using IEEE 1394 camera drivers
  contrib/oul/oufgl - video-4-linux framegrabber

The new library I propose will treat video as an input or output stream of images. You can open an input stream (which may be an encoded video file, a sequence of images, a camera, etc.) and read images one at a time. Similarly, you can open an output stream and write images one at a time. Some streams will support seeking and others will not. We probably also need a mechanism to provide a time stamp with each frame.

I have already started to implement this design in contrib/brl/bbas/vidl2. I have written stream classes for a list of images (using vil and vul), and I am working on stream classes to interface with FFMPEG to read most types of encoded files on Linux (and possibly Windows). Before I get too far I would like to get some feedback from those in the VXL community who work with video. I want to make this design meet the needs of as many people as possible.

Thank You and Happy Holidays,
Matt Leotta
Brown University
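For concreteness, the kind of base interface this proposal implies might be sketched as below. All names here are hypothetical placeholders, not the actual vidl2 API:

  #include <vil/vil_image_resource.h>

  class vidl2_istream
  {
   public:
    virtual ~vidl2_istream() {}
    //: Read the next frame; returns a null pointer at the end of the stream
    virtual vil_image_resource_sptr read_frame() = 0;
    //: True if this stream supports random access
    virtual bool is_seekable() const { return false; }
    //: Jump to an absolute frame number (only if seekable)
    virtual bool seek_frame(unsigned frame_number) { return false; }
  };

  class vidl2_ostream
  {
   public:
    virtual ~vidl2_ostream() {}
    //: Append one frame to the stream (file, image sequence, etc.)
    virtual bool write_frame(const vil_image_resource_sptr& image) = 0;
  };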
From: Miguel A. Figueroa-V. <mi...@ms...> - 2005-12-28 01:29:48
Hello Matt,

I'm interested in this. Coincidentally, I've been putting together some code to acquire images from a Matrox Meteor-II Frame Grabber (using MIL-Lite 7) and also using DirectShow for accessing cameras not connected to the Matrox frame grabbers (e.g., USB cameras, or MiniDV).

Although I see the similarity, I'm not sure that video and live feeds should be handled through the same interface. It is appealing to me, and the following is more of a question session to try and identify the dissimilarities and how to address them through a simple and coherent interface. Please bear in mind that I'm not an expert on any of these issues, so pardon any naive misconceptions I may have.

1. One of the differences I see between video feeds and live camera feeds (i.e., from frame grabbers, TV tuners, cameras, etc.) is that video feeds are usually encoded in a specific way, while the live feed can vary many different parameters, such as resolution, exposure, gain, brightness, etc. (e.g., as in the brl/vvid library). In this case the creation of a live feed object is usually more complex. I like the approach used in brl/vvid, where a cmu_1394_camera_params class contains all the parameters and you create the live feed object, cmu_1394_camera, based on this.

The way I am thinking of it is to have live feeds (e.g., vidl2_cmu1394_istream, vidl2_v4l_istream, vidl2_directshow_istream) inherit from vidl2_istream, and a parameters class hierarchy that inherits from vidl2_istream_params, which would encapsulate the parameters used for creation. Then you could write code like:

  vidl2_directshow_istream_params params(config_file);
  vidl2_directshow_istream ds_stream(params);

or

  // config_file has "type: DirectShow" or similar concept
  vbl_smart_ptr<vidl2_istream_params> params =
    vidl2_istream_params::create_params(config_file);
  vbl_smart_ptr<vidl2_istream> stream = params->create_stream();

2. My other concern is asynch vs. synch acquisition. I think this is crucial for video capture applications in which, for example, multiple frame grabber boards are used and you would want to start acquiring from both "pseudo-"simultaneously and not wait until one has finished to be able to start the next. I think this should be part of the vidl2_istream interface in a similar way that seekable streams are handled. In this case, functions such as is_asynch_capable(), read_frame_asynch(), is_capture_finished(), etc. would be needed.

3. Another topic, which I'm definitely not familiar with (but need to do the research on), is writing to disk. In many live video processing applications you might not need to write to disk, while in others your sole need is to capture the stream to disk for offline processing. I am assuming that the frame grabbers have mechanisms to directly transfer the acquired data from their memory to the hard drive with minimum utilization of the CPU. In this case, there is a need to be able to connect a vidl2_istream with a vidl2_ostream with as little overhead as possible, specifically without the extra image copying.

Well, I think these are my main thoughts so far... Again, I'm interested in this and I would contribute to the creation of a DirectShow object stream and a wrapper around MIL Lite 7.0.

Thanks and Happy Holidays to everyone as well.

Miguel

--
Miguel A. Figueroa-Villanueva
Michigan State University
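A minimal sketch of how the factory idea above might dispatch; every name below is hypothetical, the config-file parsing is elided, and the subclass's create_stream() is left declared only:

  #include <vcl_string.h>

  class vidl2_istream;  // as sketched earlier in the thread

  struct vidl2_istream_params
  {
    virtual ~vidl2_istream_params() {}
    //: Build a params subclass from a type name (e.g. read from a config file)
    static vidl2_istream_params* create_params(const vcl_string& type);
    //: Build the stream that this params block describes
    virtual vidl2_istream* create_stream() const = 0;
  };

  struct vidl2_directshow_istream_params : public vidl2_istream_params
  {
    vcl_string device_name_;
    //: would do: return new vidl2_directshow_istream(*this);
    virtual vidl2_istream* create_stream() const;
  };

  vidl2_istream_params*
  vidl2_istream_params::create_params(const vcl_string& type)
  {
    // in practice the type string would be parsed out of a config file
    if (type == "DirectShow")
      return new vidl2_directshow_istream_params;
    return 0; // unknown type
  }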
From: Matt L. <mat...@gm...> - 2006-01-03 17:33:50
Miguel,

Thanks for your interest. It would be great to have someone else contributing to this library, especially on the Windows side, since I work primarily in Linux. I'll try to address your concerns to the best of my ability and hopefully we can work out a common interface.

> 1. One of the differences I see between video feeds and live camera feeds
> (i.e., from frame grabbers, TV tuners, cameras, etc.) is that video
> feeds are usually encoded in a specific way, while the live feed can
> vary many different parameters, such as resolution, exposure, gain,
> brightness, etc. (e.g., as in the brl/vvid library). In this case the
> creation of a live feed object is usually more complex. I like the
> approach used in brl/vvid, where a cmu_1394_camera_params class contains
> all the parameters and you create the live feed object, cmu_1394_camera,
> based on this.

I agree. Some video feeds will have many parameters while others will have very few. My hope was that we could always specify the parameters during stream construction (or in some initialization routine) and then simply request frames. This way we would have a common interface to cameras and encoded video files.

> The way I am thinking of it is to have live feeds (e.g.,
> vidl2_cmu1394_istream, vidl2_v4l_istream, vidl2_directshow_istream)
> inherit from vidl2_istream, and a parameters class hierarchy that
> inherits from vidl2_istream_params, which would encapsulate the
> parameters used for creation.

Absolutely. A parameters structure would be very useful for camera streams when there are many parameters to work with. However, I'm not so sure about the reason for a parameters class hierarchy. Is there really a well defined hierarchy for parameters classes? What is the advantage over simply defining a parameters class for each stream that needs one?

> 2. My other concern is asynch vs. synch acquisition. I think this is
> crucial for video capture applications in which, for example, multiple
> frame grabber boards are used and you would want to start acquiring from
> both "pseudo-"simultaneously and not wait until one has finished to be
> able to start the next. I think this should be part of the vidl2_istream
> interface in a similar way that seekable streams are handled. In this
> case, functions such as is_asynch_capable(), read_frame_asynch(),
> is_capture_finished(), etc. would be needed.

That's a good point, but maybe we can handle this with a more general interface. If we had an is_image_ready() function in the base vidl2_istream class, it could act as is_capture_finished() but for all input streams. This function would always return true for other types of input streams. Then you could treat all input streams as asynch capable, and you also wouldn't need a separate read_frame_asynch() function. Would this handle what you are talking about? I don't have much experience in this area, so let me know if I am missing the point.

Of course, the other issue is synchronized collection. I haven't looked into that much.

> 3. Another topic, which I'm definitely not familiar with (but need to do
> the research on), is writing to disk. In many live video processing
> applications you might not need to write to disk, while in others your
> sole need is to capture the stream to disk for offline processing. I am
> assuming that the frame grabbers have mechanisms to directly transfer
> the acquired data from their memory to the hard drive with minimum
> utilization of the CPU. In this case, there is a need to be able to
> connect a vidl2_istream with a vidl2_ostream with as little overhead as
> possible, specifically without the extra image copying.

I've also been thinking about this. Connecting a vidl2_istream to a vidl2_ostream should be as simple as passing each vil_image_resource_sptr from vidl2_istream::read_frame() to vidl2_ostream::write_frame(). The smart pointer will prevent extra copying of the image data. However, a bigger concern might be memory allocation. The istream will need to allocate new memory for each frame. The user will typically not want to overwrite the data from the previous frame. Yet, in some cases this is exactly what we want, because it is faster and we no longer need the previous frame after it is used.

I am thinking of adding a function that allows the user to optionally supply the memory to be used.

> Well, I think these are my main thoughts so far... Again, I'm interested
> in this and I would contribute to the creation of a DirectShow object
> stream and a wrapper around MIL Lite 7.0.

Any contribution you can make will be greatly appreciated. I know several people who have been looking for a way to read and write video files using DirectShow in VXL.

Thank You,
Matt Leotta
Brown University
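As a rough sketch of the istream-to-ostream connection described above (the stream class names and constructor arguments are hypothetical; only the smart pointer is handed over, so no pixel data is copied):

  vidl2_ffmpeg_istream in("input.avi");        // hypothetical stream classes
  vidl2_image_list_ostream out("frames/");
  for (;;)
  {
    vil_image_resource_sptr frame = in.read_frame();
    if (!frame)               // end of stream
      break;
    out.write_frame(frame);   // passes the smart pointer; pixels are not copied
  }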
From: Miguel A. Figueroa-V. <mi...@ms...> - 2006-01-04 08:33:08
Matt Leotta wrote:
> work primarily in Linux. I'll try to address your concerns to the
> best of my ability and hopefully we can work out a common interface.

I'm not much of an expert either... so, I guess it would be nice to have the feedback of more experienced members in this area.

> I agree. Some video feeds will have many parameters while others will
> have very few. My hope was that we could always specify the
> parameters during stream construction (or in some initialization
> routine) and then simply request frames. This way we would have a
> common interface to cameras and encoded video files.

I agree on this. I think all configuration should be done at construction, and then you just request video frames or work with the stream as needed. This would help keep a clean interface for the stream handling, without the clutter of many functions to adjust every possible parameter.

>> The way I am thinking of it is to have live feeds (e.g.,
>> vidl2_cmu1394_istream, vidl2_v4l_istream, vidl2_directshow_istream)
>> inherit from vidl2_istream, and a parameters class hierarchy that
>> inherits from vidl2_istream_params, which would encapsulate the
>> parameters used for creation.
>
> Absolutely. A parameters structure would be very useful for camera
> streams when there are many parameters to work with. However, I'm not
> so sure about the reason for a parameters class hierarchy. Is there
> really a well defined hierarchy for parameters classes? What is the
> advantage over simply defining a parameters class for each stream that
> needs one?

I'm really not sure... there are parameters that might overlap (e.g., size_x, size_y, multibuffering, multichannel, etc.), but I don't think this would be reason enough to have a hierarchy of classes for params.

I think it boils down to how you want to configure the params class and create the stream. As Dr. McCane mentioned, we would like to support stream creation as in the following for simple configurations:

  // not familiar with this, but assume it's the linux/v4l way...
  vidl_istream input("/dev/video0");

  // for video files
  vidl_istream input("file.avi");

But for more complex configurations you should have a way of configuring the stream. The easiest approach, in my opinion, is to have a configuration file parsed to load the options into the params class, and let the params create the stream using a factory method. In this case, if there is a params hierarchy, then I don't need to know the type to instantiate at compilation time (create_params returns a vidl_istream_params instead of a vidl_istream_dshow_params):

  vidl_istream_params *params =
    vidl_istream_params::create_params(config_file);
  vidl_istream *input = params->create_stream();

I can also see code like this coming in handy:

  // create a params object (type not known at compile time)
  vcl_string type = "DirectShow";
  vidl_istream_params *params =
    vidl_istream_params::create_params(type);

  // then configure some settings, possibly through GUI menu items
  params->set_xxx(...);
  params->set_yyy(...);

  // then create the stream as:
  vidl_istream *input = params->create_stream();
  // or
  vidl_istream input(params);

These are ideas on how I think things could work, but I don't know what the standard way of doing things is... So, I guess one of the first tasks is to decide on what type of interface for stream creation we want to support. Whether params will be a hierarchy of classes will be determined by that. I guess it might be possible to design a template-based approach...

>> 2. My other concern is asynch vs. synch acquisition. I think this is
>> crucial for video capture applications in which, for example, multiple
>> frame grabber boards are used and you would want to start acquiring from
>> both "pseudo-"simultaneously and not wait until one has finished to be
>> able to start the next. I think this should be part of the vidl2_istream
>> interface in a similar way that seekable streams are handled. In this
>> case, functions such as is_asynch_capable(), read_frame_asynch(),
>> is_capture_finished(), etc. would be needed.
>
> That's a good point, but maybe we can handle this with a more general
> interface. If we had an is_image_ready() function in the base
> vidl2_istream class, it could act as is_capture_finished() but for all
> input streams. This function would always return true for other types
> of input streams. Then you could treat all input streams as asynch
> capable, and you also wouldn't need a separate read_frame_asynch()
> function. Would this handle what you are talking about? I don't have
> much experience in this area, so let me know if I am missing the point.
>
> Of course, the other issue is synchronized collection. I haven't
> looked into that much.

Well, I think it is nice to have a different member function for reading synch/asynch... this is to facilitate things for the user, who can get confused if he calls input.read_frame() and then starts working with the frame without knowing that it isn't finished.

I think is_image_ready() works fine. And the safest thing would be to let read_frame() default to synchronous capture and have a read_frame_asynch() for asynch (it can of course do the same as read_frame() if the device doesn't support asynch, or throw an exception). Also, note that synch capture can always be implemented as an asynch read that waits for the image to be ready:

  vil_image_resource_sptr read_frame(void)
  {
    vil_image_resource_sptr temp = read_frame_asynch();
    while (!is_image_ready());
    return temp;
  }

The code for multiple asynch captures would look something like:

  input1.read_frame_asynch();
  input2.read_frame_asynch();
  input3.read_frame_asynch();

  // wait until all three frames are ready
  while ( !input1.is_image_ready()
       || !input2.is_image_ready()
       || !input3.is_image_ready() );

  // process the frames

>> 3. Another topic, which I'm definitely not familiar with (but need to do
>> the research on), is writing to disk. In many live video processing
>> applications you might not need to write to disk, while in others your
>> sole need is to capture the stream to disk for offline processing. I am
>> assuming that the frame grabbers have mechanisms to directly transfer
>> the acquired data from their memory to the hard drive with minimum
>> utilization of the CPU. In this case, there is a need to be able to
>> connect a vidl2_istream with a vidl2_ostream with as little overhead as
>> possible, specifically without the extra image copying.
>
> I've also been thinking about this. Connecting a vidl2_istream to a
> vidl2_ostream should be as simple as passing each
> vil_image_resource_sptr from vidl2_istream::read_frame() to
> vidl2_ostream::write_frame(). The smart pointer will prevent extra
> copying of the image data. However, a bigger concern might be memory
> allocation. The istream will need to allocate new memory for each
> frame. The user will typically not want to overwrite the data from
> the previous frame. Yet, in some cases this is exactly what we want,
> because it is faster and we no longer need the previous frame after it
> is used.
>
> I am thinking of adding a function that allows the user to optionally
> supply the memory to be used.

In the case of video capture (and I think this applies to file input as well) you don't want to be allocating things per frame. I think you should have a set of buffers allocated (vector<vil_image_resource_sptr>) and cycle through them. For example, in the case of double buffering you can use two buffers: while you process one, you capture into the other. This can be handled in the vidl_istream class, hidden from the user. The user can always do a deep copy of his smart pointer handle if he wishes not to let that frame be destroyed in the next acquisition.

What concerns me is that the buffers vidl_istream allocates will be in the heap, while I think we can make more efficient use of frame grabbers through their API (MIL Lite for Matrox, for example) to collect into the memory on the frame grabber and copy from there directly to the hard drive. In this case I'm not sure we will be able to wrap this in a vil_image_resource_sptr.

For example, for MIL/Matrox the API is as follows:

  // over simplified sequence to give an idea...
  MdigGrab(dig_id, buff_id);               // grabs the frame to device
  MbufGet(buff_id, (void *) image);        // copy the buffer from device
                                           // to user supplied memory
  MbufExport("file.bmp", M_BMP, buff_id);  // save buffer to disk

dig_id and buff_id are integer identifiers that are used internally to identify previously allocated resources.

The thing to note here is that if I want to pass the image from device to disk directly, I don't need the MbufGet function call. So, a vidl_istream::read_frame() would usually wrap the MdigGrab and MbufGet calls, and I would use vidl_ostream::write_frame() to save the frame. But this is inefficient because I could have done it without the extra image copy (MbufGet), which is expensive... I think this is true for any device that can handle DMA memory (that is, you don't have a handle on the frame unless you copy it into vidl_istream supplied memory like MbufGet does), but I'm not sure.

Did I explain my concerns correctly? Any ideas on how to handle such issues in an efficient way?

> Any contribution you can make will be greatly appreciated. I know
> several people who have been looking for a way to read and write video
> files using DirectShow in VXL.

Well, I am putting some thoughts together, as you can see by this email, and I will soon be submitting some code. I guess we need to refine the interface, but things are looking good. Thanks for taking this project up. IMHO, frame grabber/camera capture is really missing from the core VXL.

--Miguel
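A minimal sketch of the cycling-buffer idea (the class is hypothetical and the actual hardware grab is elided; vil_new_image_resource is used to preallocate in-memory frames once, up front):

  #include <vcl_vector.h>
  #include <vil/vil_new.h>
  #include <vil/vil_image_resource.h>

  class example_buffered_istream   // hypothetical
  {
    vcl_vector<vil_image_resource_sptr> buffers_;
    unsigned next_;
   public:
    example_buffered_istream(unsigned n_buffers, unsigned ni, unsigned nj)
      : buffers_(n_buffers), next_(0)
    {
      for (unsigned i = 0; i < n_buffers; ++i)   // allocate once, up front
        buffers_[i] = vil_new_image_resource(ni, nj, 3, VIL_PIXEL_FORMAT_BYTE);
    }

    vil_image_resource_sptr read_frame()
    {
      vil_image_resource_sptr buf = buffers_[next_];
      next_ = (next_ + 1) % buffers_.size();
      // ... grab the next hardware frame into buf's pixel memory here ...
      return buf;   // valid until this buffer comes around again
    }
  };

With two buffers this is exactly the double-buffering case: the frame returned to the user stays valid while the next grab fills the other buffer.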
From: Miguel A. Figueroa-V. <mi...@ms...> - 2005-12-29 18:49:48
Hello,

It seems that the VXL Package Documentation site:

  http://paine.wiau.man.ac.uk/pub/doc_vxl/index.html

has been down forever... (that is to say, over 2 weeks maybe?). I remember a proposal for a mirror or something. Well, I just wanted to know if something is being done, or if it is just my machine playing tricks on me...

Thanks,
Miguel
From: Brendan M. <mc...@cs...> - 2006-01-03 20:55:11
Hi Matt,

Sounds like a great idea. I have two main requirements for such a library (especially on the live capture side): first, it should be simple to use, and second, it should be efficient. A lot of the time, what I'd like to do is specify a source (be it camera input, video file or whatever), and have the software figure out the capabilities and set parameters to defaults, so it just works. So here's what I'd like to do:

  vidl_istream input("/dev/video0");
  while (++input != NULL)
    // process frame

OR

  vidl_istream input("~/testvideo.avi");
  while (++input != NULL)
    // process frame

or some similar/equivalent interface.

Just to let you know, I currently have an intern working on a new improved oufgl. Currently it handles USB webcams on Linux and Windows, and hopefully will soon handle 1394 cameras as well. I think it should be relatively easy to refactor the new stuff into the library structure you are suggesting.

Anyway, I'm happy to help out any way I can.

--
Cheers, Brendan
From: Matt L. <mat...@gm...> - 2006-01-04 20:16:28
On 1/4/06, Miguel A. Figueroa-Villanueva <mi...@ms...> wrote:
>> Is there
>> really a well defined hierarchy for parameters classes? What is the
>> advantage over simply defining a parameters class for each stream that
>> needs one?
>
> I'm really not sure... there are parameters that might overlap (e.g.,
> size_x, size_y, multibuffering, multichannel, etc.), but I don't think
> this would be reason enough to have a hierarchy of classes for params.
>
> I think it boils down to how you want to configure the params class and
> create the stream.

It's true that some parameters will overlap, but I'm not sure that there are enough to warrant a hierarchy. I'm not in favor of having the parameters class act as a factory for streams, partially because many streams will not even need parameters other than a file/device name. Instead I would rather see the user create the stream directly and specify a parameters struct to the constructor if needed.

> As Dr. McCane mentioned, we would like to support stream creation as in
> the following for simple configurations:
>
>   // not familiar with this, but assume it's the linux/v4l way...
>   vidl_istream input("/dev/video0");
>
>   // for video files
>   vidl_istream input("file.avi");

This is more what I had in mind, except I think the user should determine the type of stream, like:

  vidl2_v4l_istream input("/dev/video0", params);
or
  vidl2_directshow_istream input("file.avi");

I'm not very confident that we want the vidl2 code to automatically determine the type of stream to use (as is done in vil_load). There could be more than one available stream that works (such as DirectShow and FFMPEG).

> But for more complex configurations you should have a way of configuring
> the stream. The easiest approach, in my opinion, is to have a
> configuration file parsed to load the options into the params class, and
> let the params create the stream using a factory method. In this case, if
> there is a params hierarchy, then I don't need to know the type to
> instantiate at compilation time (create_params returns a
> vidl_istream_params instead of a vidl_istream_dshow_params):

Do you have an application for this? That is, where you don't know what type of stream you want to use until you parse a configuration file? We could certainly do this; it just seems much more common that you would know what type of stream you want before you try to load a configuration file.

>   vidl_istream_params *params =
>     vidl_istream_params::create_params(config_file);
>   vidl_istream *input = params->create_stream();
>
> I can also see code like this coming in handy:
>
>   // create a params object (type not known at compile time)
>   vcl_string type = "DirectShow";
>   vidl_istream_params *params =
>     vidl_istream_params::create_params(type);
>
>   // then configure some settings, possibly through GUI menu items
>   params->set_xxx(...);
>   params->set_yyy(...);

This only works for the parameters in the base class. You would have to downcast to set the other parameters, which seems to defeat the purpose.

>   // then create the stream as:
>   vidl_istream *input = params->create_stream();
>   // or
>   vidl_istream input(params);
>
> These are ideas on how I think things could work, but I don't know what
> the standard way of doing things is... So, I guess one of the first
> tasks is to decide on what type of interface for stream creation we want
> to support. Whether params will be a hierarchy of classes will be
> determined by that. I guess it might be possible to design a
> template-based approach...

> Well, I think it is nice to have a different member function for reading
> synch/asynch... this is to facilitate things for the user, who can get
> confused if he calls input.read_frame() and then starts working with the
> frame without knowing that it isn't finished.
>
> I think is_image_ready() works fine. And the safest thing would be to
> let read_frame() default to synchronous capture and have a
> read_frame_asynch() for asynch (it can of course do the same as
> read_frame() if the device doesn't support asynch, or throw an
> exception). Also, note that synch capture can always be implemented as
> an asynch read that waits for the image to be ready:
>
>   vil_image_resource_sptr read_frame(void)
>   {
>     vil_image_resource_sptr temp = read_frame_asynch();
>     while (!is_image_ready());
>     return temp;
>   }
>
> The code for multiple asynch captures would look something like:
>
>   input1.read_frame_asynch();
>   input2.read_frame_asynch();
>   input3.read_frame_asynch();
>
>   // wait until all three frames are ready
>   while ( !input1.is_image_ready()
>        || !input2.is_image_ready()
>        || !input3.is_image_ready() );
>
>   // process the frames

Okay, I see what you mean now, but I'm still in favor of having a single read_frame() function if possible. I think it would be more confusing having two, even if they both did the same thing for other types of streams. I was actually thinking the code would go more like:

  // wait until all three frames are ready
  while ( !input1.is_image_ready()
       || !input2.is_image_ready()
       || !input3.is_image_ready() );

  input1.read_frame();
  input2.read_frame();
  input3.read_frame();

Where read_frame() does not return until the image is ready (but here we've checked in advance to see that it is). This way users could choose to ignore the use of is_image_ready() if they don't care about reading asynch. Either way, nobody gets stuck with an image that is not ready. Might we also need a trigger() function to make this work? Would it be possible to start the frame grab in the first call to is_image_ready()?

> In the case of video capture (and I think this applies to file input as
> well) you don't want to be allocating things per frame. I think you
> should have a set of buffers allocated (vector<vil_image_resource_sptr>)
> and cycle through them. For example, in the case of double buffering you
> can use two buffers: while you process one, you capture into the other.
> This can be handled in the vidl_istream class, hidden from the user. The
> user can always do a deep copy of his smart pointer handle if he wishes
> not to let that frame be destroyed in the next acquisition.

Actually, I'm in favor of having the stream allocate new memory for each image that is requested as the default action. I don't think that users should have to worry about doing an explicit deep copy to prevent their image from being overwritten.

That said, for any performance driven application this is a very bad idea. I was thinking we could allow the user to optionally supply a memory pool for the stream. This way you can recycle memory when you are done with it. Maybe something like:

  vidl2_istream.recycle(vil_memory_chunk_sptr);

or

  vidl2_istream.recycle(vil_image_resource_sptr);

This would add the image's memory to a queue of buffers to reuse. If the queue becomes empty, new memory is allocated. So we might do this:

  while (istream.is_valid())
  {
    vil_image_resource_sptr image = istream.read_frame();

    // do something with the image

    istream.recycle(image);
  }

> What concerns me is that the buffers vidl_istream allocates will be
> in the heap, while I think we can make more efficient use of frame
> grabbers through their API (MIL Lite for Matrox, for example) to collect
> into the memory on the frame grabber and copy from there directly to the
> hard drive. In this case I'm not sure we will be able to wrap this in a
> vil_image_resource_sptr.
>
> For example, for MIL/Matrox the API is as follows:
>
>   // over simplified sequence to give an idea...
>   MdigGrab(dig_id, buff_id);               // grabs the frame to device
>   MbufGet(buff_id, (void *) image);        // copy the buffer from device
>                                            // to user supplied memory
>   MbufExport("file.bmp", M_BMP, buff_id);  // save buffer to disk
>
> dig_id and buff_id are integer identifiers that are used internally to
> identify previously allocated resources.
>
> The thing to note here is that if I want to pass the image from device
> to disk directly, I don't need the MbufGet function call. So, a
> vidl_istream::read_frame() would usually wrap the MdigGrab and MbufGet
> calls, and I would use vidl_ostream::write_frame() to save the frame.
> But this is inefficient because I could have done it without the extra
> image copy (MbufGet), which is expensive... I think this is true for any
> device that can handle DMA memory (that is, you don't have a handle on
> the frame unless you copy it into vidl_istream supplied memory like
> MbufGet does), but I'm not sure.

This shouldn't be a problem. The magic of using an image_resource_sptr is that it doesn't have to be an in-memory image. In fact, the vidl2_image_list_istream that I've already written can return resources to images on disk. The image is not actually loaded into memory until you call vil_image_resource_sptr->get_view(). So you should be able to use this stream design to do exactly what you want to do. Example:

1) Create a new type of resource, say vidl2_mil_lite_resource. This resource would store the buff_id and use MbufGet as part of its get_view() function.

2) Create a vidl2_mil_lite_istream that calls MdigGrab and returns a vidl2_mil_lite_resource_sptr.

3) Create a vidl2_mil_lite_ostream that uses MbufExport to save the image without ever calling get_view(). Maybe this would be derived from vidl2_image_list_ostream.

I think we're making progress here. I plan to check in a vidl2_ffmpeg_ostream sometime soon that uses a very large parameter structure. Feel free to contribute code at any time.

--Matt
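A rough sketch of the recycle-queue idea (the class name and get_buffer() helper are hypothetical; this assumes vil_memory_chunk's (size, pixel format) constructor):

  #include <vcl_deque.h>
  #include <vil/vil_memory_chunk.h>

  class example_recycling_istream   // hypothetical
  {
    vcl_deque<vil_memory_chunk_sptr> pool_;
   public:
    //: Return a frame's memory to the pool for reuse
    void recycle(const vil_memory_chunk_sptr& mem) { pool_.push_back(mem); }

   protected:
    //: Reuse a pooled buffer if one is big enough; otherwise allocate anew
    vil_memory_chunk_sptr get_buffer(unsigned long nbytes)
    {
      if (!pool_.empty() && pool_.front()->size() >= nbytes)
      {
        vil_memory_chunk_sptr mem = pool_.front();
        pool_.pop_front();
        return mem;
      }
      return new vil_memory_chunk(nbytes, VIL_PIXEL_FORMAT_BYTE);
    }
  };

Users who never call recycle() get the safe allocate-per-frame behavior; performance-driven users opt in to reuse.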
From: Brendan M. <mc...@cs...> - 2006-01-04 21:24:33
On Wed, 2006-01-04 at 15:16 -0500, Matt Leotta wrote:
> On 1/4/06, Miguel A. Figueroa-Villanueva <mi...@ms...> wrote:
> > As Dr. McCane mentioned, we would like to support stream creation as in
> > the following for simple configurations:
> >
> >   // not familiar with this, but assume it's the linux/v4l way...
> >   vidl_istream input("/dev/video0");
> >
> >   // for video files
> >   vidl_istream input("file.avi");
>
> This is more what I had in mind, except I think the user should
> determine the type of stream, like:
>
>   vidl2_v4l_istream input("/dev/video0", params);
> or
>   vidl2_directshow_istream input("file.avi");
>
> I'm not very confident that we want the vidl2 code to automatically
> determine the type of stream to use (as is done in vil_load). There
> could be more than one available stream that works (such as DirectShow
> and FFMPEG).

I agree that it may be difficult or even impossible, but it would still be nice from a user's point of view. As a minimum, I would think those streams should inherit from a base class so I can say:

  vidl2_istream *input = new vidl2_v4l_istream("/dev/video0", params);

But even this is not quite satisfactory, because I don't want to have to specify explicitly that it's a v4l stream if I can avoid it. In a similar way, if I use vgui, I don't have to specify a qt_widget or an mfc_widget if I don't want to. It may be that a similar mechanism of registering toolkits is required, but I don't have a good idea about how to do this as yet.

> > Well, I think it is nice to have a different member function for reading
> > synch/asynch... [...]
> >
> > I think is_image_ready() works fine. And the safest thing would be to
> > let read_frame() default to synchronous capture and have a
> > read_frame_asynch() for asynch [...]
>
> Okay, I see what you mean now, but I'm still in favor of having a
> single read_frame() function if possible. I think it would be more
> confusing having two, even if they both did the same thing for other
> types of streams. I was actually thinking the code would go more like:
>
>   // wait until all three frames are ready
>   while ( !input1.is_image_ready()
>        || !input2.is_image_ready()
>        || !input3.is_image_ready() );
>
>   input1.read_frame();
>   input2.read_frame();
>   input3.read_frame();
>
> Where read_frame() does not return until the image is ready (but here
> we've checked in advance to see that it is). This way users could
> choose to ignore the use of is_image_ready() if they don't care about
> reading asynch. Either way, nobody gets stuck with an image that is
> not ready. Might we also need a trigger() function to make this work?
> Would it be possible to start the frame grab in the first call to
> is_image_ready()?

I'm confused by your solution here, Matt. How would asynch capture happen? I'd be in favour of Miguel's idea here - it seems fairly clear what is going on, and if read_frame() is the synchronous version, then that still makes it easy for simple applications. And then users won't use read_frame_asynch() unless they know what they're doing...

> Actually, I'm in favor of having the stream allocate new memory for
> each image that is requested as the default action. I don't think
> that users should have to worry about doing an explicit deep copy to
> prevent their image from being overwritten.
>
> That said, for any performance driven application this is a very bad
> idea. I was thinking we could allow the user to optionally supply a
> memory pool for the stream. This way you can recycle memory when you
> are done with it. Maybe something like:
>
>   vidl2_istream.recycle(vil_memory_chunk_sptr);
>
> or
>
>   vidl2_istream.recycle(vil_image_resource_sptr);
>
> This would add the image's memory to a queue of buffers to reuse. If
> the queue becomes empty, new memory is allocated. So we might do this:
>
>   while (istream.is_valid())
>   {
>     vil_image_resource_sptr image = istream.read_frame();
>
>     // do something with the image
>
>     istream.recycle(image);
>   }

This is a neat idea, but I think it's dangerous to allocate new frames by default. If you're capturing a 640x480 image, that's 1M. At 30 frames/sec that's only 30s of capture before most machines start creaking/falling over. How about just specifying the number of queued frames a priori?

> > What concerns me is that the buffers vidl_istream allocates will be
> > in the heap, while I think we can make more efficient use of frame
> > grabbers through their API (MIL Lite for Matrox, for example) to collect
> > into the memory on the frame grabber and copy from there directly to the
> > hard drive. [...]
>
> This shouldn't be a problem. The magic of using an
> image_resource_sptr is that it doesn't have to be an in-memory image.
> In fact, the vidl2_image_list_istream that I've already written can
> return resources to images on disk. The image is not actually loaded
> into memory until you call vil_image_resource_sptr->get_view(). So
> you should be able to use this stream design to do exactly what you
> want to do. Example:
>
> 1) Create a new type of resource, say vidl2_mil_lite_resource. This
> resource would store the buff_id and use MbufGet as part of its
> get_view() function.
>
> 2) Create a vidl2_mil_lite_istream that calls MdigGrab and returns a
> vidl2_mil_lite_resource_sptr.
>
> 3) Create a vidl2_mil_lite_ostream that uses MbufExport to save the
> image without ever calling get_view(). Maybe this would be derived
> from vidl2_image_list_ostream.

Hmmm. I think perhaps using an mmap might make this easier?

> I think we're making progress here. I plan to check in a
> vidl2_ffmpeg_ostream sometime soon that uses a very large parameter
> structure. Feel free to contribute code at any time.
>
> --Matt

--
Cheers, Brendan
From: Miguel A. Figueroa-V. <mi...@ms...> - 2006-01-05 01:23:07
I guess you guys are too fast for me... I've been trying to reply to Matt's email, only to be interrupted by incoming messages on the subject :) I guess I like the fact that the feedback is flowing!

Amitha Perera wrote:
> 1. Parameter blocks.
>
> First, a class should not inherit from its parameter
> block. Inheritance should be an is-a relationship, and that does not
> apply. The cost is simply a five character prefix:
>
>   struct object_params {
>     int p1, p2;
>   };
>
>   class object {
>     object_params prms_;
>    public:
>     object( object_params const& p ) : prms_(p) { }
>
>     int foo() { return prms_.p1 + prms_.p2; }
>     // instead of
>     // int foo() { return p1 + p2; }
>   };

I favor object composition as well.

> Parameter blocks should have (clearly documented) default values, and
> set methods to change them.
>
>   struct video_out_params {
>     //: The width of the image (default 320).
>     int width_;
>
>     //: The height of the image (default 240).
>     int height_;
>
>     video_out_params()
>       : width_( 320 ),
>         height_( 240 )
>     { }
>
>     //: \sa width
>     video_out_params& set_width( int v ) { width_ = v; return *this; }
>
>     //: \sa height
>     video_out_params& set_height( int v ) { height_ = v; return *this; }
>   };
>
> Using the parameter block is then something like
>
>   video_out_params p;
>   p.set_width( 640 );
>   video_out ostr( p );
>
> or even
>
>   video_out ostr( video_out_params()
>                   .set_width( 640 )
>                   .set_height( 320 ) );

This is similar to the setup I have, except (as noted in my discussion with Matt) that object_params is a hierarchy of classes. Now, I think having one parameter block for all types of camera/frame grabber streams is nice, but can it accommodate all of them without cluttering the use for other devices? For example, for Matrox I have options like the following that might not be supported on other devices:

  M_GRAB_SCALE: M_FILL_DESTINATION
  M_GRAB_START_MODE: M_FIELD_START
  M_GRAB_FIELD_NUM: 1
  M_GRAB_AUTOMATIC_INPUT_GAIN: M_DISABLE
  M_CAMERA_LOCK: M_ENABLE

That is why I thought about a params hierarchy that would abstract common parameters in a base class and leave obscure device-specific parameter handling in a subclass. Addressing Matt's comment:

>> // then configure some settings, possibly through GUI menu items
>> params->set_xxx(...);
>> params->set_yyy(...);
>
> This only works for the parameters in the base class. You would have
> to downcast to set the other parameters, which seems to defeat the
> purpose.

I should have been more clear... You can have something like:

  params->set_property(tag, value); // virtual function

If the tag is not recognized, it can return false or throw an exception.

In summary, the question that remains is: can one parameter block class accommodate all of the stream devices without cluttering the use for some devices?

> 2. Factory classes
>
> The way to get a run-time configurable input or output stream would
> be to use a factory class _for_the_whole_stream_.
>
>   vidl2_istream_sptr stream = vidl2_create_stream_from_xml( "config.xml" );
>
> The factory would then understand the available streams, and delegate
> to other classes to actually create the appropriate stream objects.
>
> Run-time changes of the input source are actually a very useful
> thing. Imagine, for example,
>
>   <reader>ffmpeg</reader>
>   <filename>test.avi</filename>
> vs
>   <reader>directshow</reader>
>   <filename>test.avi</filename>
>
> which allows the same file to be read via different mechanisms. That
> allows the user to work around known shortcomings in the various
> reader implementations.

This clearly explains why I don't like having to fix the stream at compile time. Addressing Matt's comment:

>> I'm not very confident that we want the vidl2 code to automatically
>> determine the type of stream to use (as is done in vil_load). There
>> could be more than one available stream that works (such as DirectShow
>> and FFMPEG).

While I think that we shouldn't require fixing the stream type at compile time, I'm not sure we want to let the vidl2 code figure out the type automatically either. This is because, as Matt mentions, multiple toolkits could handle it. Although one could argue towards letting the first in the list of toolkits handle it by default... BTW, how do you specify a device in Windows (the alternative to "/dev/video0")?

But, more importantly, it might take a lot of effort to probe a toolkit to see whether it can handle it (not for video files, but for frame grabbers and cameras). For example, in DirectShow you would need to allocate two enumerator COM objects to get to the first available device. I don't know how fast this would be... and I don't know how many times the default would be the one the user wants.

> 3. Memory allocation
>
> I think the default behaviour of read_frame should be to give access
> to an image that is valid until the next call to read_frame. This
> allows a framegrabber to create an image_view that simply wraps the
> framegrabber's memory, if that is possible. Providing a pre-allocated
> memory segment does not allow for this.
>
> If the user wishes to keep the image around for longer, they would
> have to call deep_copy on that. If the efficiency of that becomes a
> question, then a method read_frame_copy() could generate an image_view
> that would not be subsequently modified by the video reader, and the
> video reader could implement this as efficiently as possible.
>
> I think that video users should be aware of the efficiencies
> associated with memory allocation, and should consciously choose to
> do a deep copy.

I agree here. But I think there should be an option to establish a pipeline in the stream (multi-buffering), so that read_frame gives access to an image that is valid until the next N calls to read_frame. For example:

  stream.set_pipeline(3); // or set_n_buffers or whatever else

  vil_image_resource_sptr img[3];
  img[0] = stream.read_frame();
  img[1] = stream.read_frame();

  int i = 1;
  while (++i) {
    img[i%3] = stream.read_frame();
    process_background_subtraction(img[(i-2)%3], img[(i-1)%3], img[i%3]);
  }

Note that here when I set_pipeline(3) I allocate 3 images in the vidl2_istream class and no more allocations are needed.

> 4. Asynchronous capture
>
> In general, I prefer two methods for getting images: advance() and
> current_frame(). The simple loop would then be something like
>
>   while( stream.advance() ) {
>     process( stream.current_frame() );
>   }
>
> For async capture, advance() could be divided into advance_start()
> (non-blocking), advance_wait() (blocking), is_frame_available()
> (non-blocking).
>
> Scenario 1:
>
>   while( stream.advance_wait() ) {
>     img = stream.current_frame().deep_copy();
>     stream.advance_start();
>     // do other processing
>   }
>
> Scenario 2:
>
>   while( stream.advance_wait() ) {
>     img = stream.current_frame().deep_copy();
>     stream.advance_start();
>     while( ! stream.is_frame_available() ) {
>       // do idle processing
>     }
>   }

If I understand things correctly, stream.advance_wait() is unnecessary in Scenario 2, right? That is because I checked is_frame_available() before... And why the deep_copy?

Am I to assume that as soon as I call advance_start(), current_frame() is not valid? This would be another reason for the multi-buffering as explained in the previous item. Then this would work:

Scenario 2:

  stream.set_pipeline(2);
  while (stream.advance_wait()) {
    img = stream.current_frame(); // no deep_copy
    stream.advance_start();
    process(img);
  }

--Miguel
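A small sketch of the set_property() idea (hypothetical class names; the base class rejects unknown tags, so subclasses only add what they understand):

  #include <vcl_string.h>

  struct example_istream_params    // hypothetical base class
  {
    virtual ~example_istream_params() {}
    //: Set a device-specific option; returns false if the tag is unknown
    virtual bool set_property(const vcl_string& tag, const vcl_string& value)
    {
      return false;  // no properties handled at this level
    }
  };

  struct example_matrox_params : public example_istream_params
  {
    bool camera_lock_;
    example_matrox_params() : camera_lock_(false) {}
    virtual bool set_property(const vcl_string& tag, const vcl_string& value)
    {
      if (tag == "M_CAMERA_LOCK") {
        camera_lock_ = (value == "M_ENABLE");
        return true;
      }
      return example_istream_params::set_property(tag, value);
    }
  };

This keeps the obscure device-specific options out of the base class while still letting a config-file parser set them without a downcast.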
From: Miguel A. Figueroa-V. <mi...@ms...> - 2006-01-05 04:54:40
Dr. Perera,

I'm kind of lost here, so please bear with me. I think I'm agreeing with you, but I can't seem to put together a few pieces of the puzzle...

Amitha Perera wrote:
> To my comments on memory allocation, Miguel wrote:
>
>> I agree here. But I think there should be an option to establish a
>> pipeline in the stream (multi-buffering), so that read_frame gives
>> access to an image that is valid until the next N calls to
>> read_frame.
>
> I'm a firm believer in simple APIs and one process doing one
> thing. For buffered images, one could simply write a
> vidl2_buffered_stream that takes as input any vidl2_istream and
> buffers the results. Then the logic to maintain the buffers doesn't
> need to be provided by each and every istream.
>
> However, this requires a mechanism for the buffer to update when a
> new image is available. At GE, we are using a pipeline mechanism,
> where we call the stages of the pipeline "processes". A "step" on a
> process executes the process. So, your example would look something
> like
>
>   frame_process fp( /* create an istream process */ );
>
>   buffered_frame_process bp;
>   bp.set_length( 3 );
>   bp.set_frame_process( &fp );
>
>   bg_subtraction_process bgsp;
>   bgsp.set_window( 3 );
>   bgsp.set_buffered_frame_process( &bp );
>
>   while( fp.step() &&
>          bp.step() &&
>          bgsp.step() ) {
>     // do something with results of bgsp
>   }

The trick for this to work efficiently is having the method you refer to below as 'current_frame_copy', right? That is, it is the job of vidl2_buffered_stream to call this method and hold on to the images.

> A method like "current_frame_copy", with semantics like "I need a
> static image; give me one as efficiently as you can" would serve this
> purpose. The stream could do the deep copy only if
> necessary.

What is not clear to me is: in the case when the framegrabber creates an image_view that simply wraps the framegrabber's memory, how do I efficiently implement 'current_frame_copy'? Am I forced to do a deep_copy?

I'm assuming here that the framegrabber has on-board memory for more than one image, but of course that it runs out of memory much quicker than the host computer, so that in the set_pipeline approach the buffer can probably only support a small depth.

Again, I may very well be missing a few points on how the vil_image_resources work... so thanks for your patience.

--Miguel
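A minimal sketch of the buffering wrapper Amitha describes, written against the hypothetical advance()/current_frame_copy() interface from this discussion (none of this is actual vidl2 code):

  #include <vcl_deque.h>
  #include <vil/vil_image_resource.h>

  class vidl2_buffered_istream   // hypothetical
  {
    vidl2_istream* src_;                         // any underlying stream
    vcl_deque<vil_image_resource_sptr> buffer_;  // the last n frames
    unsigned max_len_;
   public:
    vidl2_buffered_istream(vidl2_istream* src, unsigned n)
      : src_(src), max_len_(n) {}

    bool advance()
    {
      if (!src_->advance())
        return false;
      // ask the source for a static copy, as cheaply as it can manage
      buffer_.push_back(src_->current_frame_copy());
      if (buffer_.size() > max_len_)
        buffer_.pop_front();   // drop the oldest frame
      return true;
    }

    //: Frame i of the buffer (0 = oldest)
    vil_image_resource_sptr frame(unsigned i) const { return buffer_[i]; }
  };

The point of current_frame_copy() here is exactly Miguel's question: a source whose frames wrap framegrabber memory can hand out wrapped resources while its on-board buffers last, and fall back to a deep copy only after that.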
From: Brendan M. <mc...@cs...> - 2006-01-05 20:30:44
On Thu, 2006-01-05 at 10:31 -0500, Amitha Perera wrote:
> > I'm assuming here that the framegrabber has on-board memory for more
> > than one image, but of course that it runs out of memory much quicker
> > than the host computer.
>
> I'm showing my ignorance of framegrabbers now: do framegrabbers
> generally have multiple image storage that we can wrap individually?

Usually they have at least 2, I think (one to acquire to, and one to read from - even USB cams often seem to have this capability - or at least the Linux drivers for USB cams seem to offer it).

> What you are suggesting implies that the framegrabber could
> efficiently give you static images, but only for a limited number of
> acquisitions. A buffering scheme can then efficiently buffer up to
> that many images, but needs to deep copy after that. Clearly, this is
> much more complicated to handle than assuming that the framegrabber
> can only store one image at a time.
>
> However, even if this is the case, I'd argue that the logic to
> generally deal with this should lie in the buffered_image_process
> process. The user specifies the length of the buffer he wants, and
> the process does it as efficiently as possible, perhaps by querying
> the input source about its internal buffer characteristics. The user
> then does not need to worry about the specifics of the video source.

I agree. I think having a buffered_image_process is a good idea. Although the usual method of asynchronous capture involves starting the next acquire before reading from the latest image. So I would advocate that the simple framegrabber at least use 2 frames if it can - this is just double buffering really.

> A few questions that come up:
> - do framegrabbers have this capability in general?

I would hazard a guess that most would have double buffering.

> - the cost savings, in general, would be the memcpy of, say, 1MB of
>   data. Is that so expensive? (The buffered process can preallocate
>   its image buffers.)

Good question. I just did a quick test. On a P4 2.8GHz machine, it takes approx 0.0009 sec to memcpy 1MB of memory. So if you want to process at 30 frames/sec, the memcpy will take up approx 3% of the available time.

--
Cheers, Brendan
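For reference, a quick timing test along the lines Brendan describes might look like this (a rough sketch; clock() is coarse, so the copy is repeated many times, and dst is printed to keep the compiler from eliding the copies):

  #include <string.h>
  #include <time.h>
  #include <stdio.h>

  int main()
  {
    const unsigned n = 1024 * 1024;   // 1MB
    static char src[n], dst[n];       // static: too big for the stack
    const int reps = 1000;

    clock_t t0 = clock();
    for (int i = 0; i < reps; ++i)
      memcpy(dst, src, n);
    double total = double(clock() - t0) / CLOCKS_PER_SEC;

    printf("%g sec per 1MB memcpy (dst[0]=%d)\n", total / reps, (int)dst[0]);
    return 0;
  }

At 0.0009 sec per copy and a 33 ms frame budget (30 frames/sec), the memcpy cost is 0.0009 / 0.0333, or roughly 3%, which matches the figure above.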
From: Matt L. <mat...@gm...> - 2006-01-05 21:27:00
I've just checked in a vidl2_ffmpeg_ostream class, which seems to be working. So far all the implemented streams deal with data on disk. It shouldn't be too difficult to adjust the API once we finally decide how to handle asynch capture and other issues.

I was planning on next working on an input stream for IEEE 1394 cameras in Linux using libdc1394. I have no experience with libdc1394, so I will hold off for now if someone else is more familiar with it and would like to work on this one. It seems like it would be good to have some working capture code to experiment with some of these memory and efficiency issues. I just happen to have firewire cameras and I use Linux, so this is the obvious choice for me.

Also, it seems that some sort of time-stamp mechanism might be useful, especially if we have to drop frames during capture. Any thoughts on this?

Thanks,
Matt Leotta
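One possible shape for the time-stamp mechanism (purely a sketch; none of these names exist in vidl2):

  #include <vil/vil_image_resource.h>

  // Pair each frame with a capture time, so dropped frames show up
  // as gaps in the time sequence rather than being silently lost.
  struct vidl2_timed_frame   // hypothetical
  {
    vil_image_resource_sptr image;
    double time;   // seconds since some stream-defined epoch
  };

  // Alternatively, the istream could expose the time of the current
  // frame alongside current_frame():
  //
  //   virtual double current_time() const = 0;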
From: Matt L. <mat...@gm...> - 2006-01-06 16:06:16
Miguel,

I tried doing this several ways. Your way is probably more correct and is what I had originally. The problem I had was that the index (unsigned int) is initialized to zero and then incremented to one with the first call of advance(). So by calling read_frame() in a loop, or

  while (istream.advance()) {
    image = istream.current_frame();
    ...
  }

the first frame (with index zero) is skipped. With the current code, read_frame() effectively returns the current frame and then advances. So the index is shifted off by one in a sense. This is messy, so I think I'll go back to your way. But this brings up the question of initial state. When I open an istream, should it automatically advance to the first frame, or should current_frame() be invalid until the user calls advance()?

--Matt

On 1/6/06, Miguel A. Figueroa-Villanueva <mi...@ie...> wrote:
> Matt,
>
> I was browsing the file vidl2_image_list_istream.cxx to understand what
> advance, read_frame, and current_frame do, and found the following:
>
> 1. shouldn't index_ in advance() have a pre-increment instead of a
> post-increment?:
>
>   bool vidl2_image_list_istream::advance()
>   {
>     return ++index_ < images_.size(); // not index_++
>   }
>
> 2. another error here, right?
>
>   vil_image_resource_sptr vidl2_image_list_istream::read_frame()
>   {
>     if (is_valid())             // checking if the current one is valid
>       return images_[++index_]; // but returning the next, which might not be!
>     else
>       return NULL;
>   }
>
> should be:
>
>   vil_image_resource_sptr vidl2_image_list_istream::read_frame()
>   {
>     advance();              // no need to check the result,
>     return current_frame(); // current_frame() will check is_valid()
>   }
>
> Anyway, hope this helps.
>
> --Miguel
From: Miguel A. Figueroa-V. <mi...@ms...> - 2006-01-06 17:36:37
|
Matt Leotta wrote:
> Miguel,
>
> I tried doing this several ways. Your way is probably more correct
> and is what I had originally. The problem I had was that the index
> (unsigned int) is initialized to zero and then incremented to one
> with the first call of advance(). So by calling read_frame() in a
> loop, or
>
> while (istream.advance()) {
>   image = istream.current_frame();
>   ...
> }
>
> the first frame (with index zero) is skipped. With the current code,
> read_frame() effectively returns the current frame and then advances,
> so the index is shifted off by one in a sense. This is messy, so I
> think I'll go back to your way. But this brings up the question of
> initial state: when I open an istream, should it automatically advance
> to the first frame, or should current_frame() be invalid until the
> user calls advance()?

1. To summarize here, I think the semantics of advance and current_frame as they have been discussed are:

a. advance: move to the next frame without loading it or getting the view from the resource (in the case of video or other non-live streams). In the case of live feeds I guess it is inevitable to load the image (into the internal buffer), because we're telling it to fill the buffer with new data... Also, return false if it couldn't advance.

bool advance()
{
  if (index_ + 1 < images_.size()) {
    index_++;
    return true;
  }
  else
    return false;
}

bool advance()
{
  // works, but sends index_ into oblivion,
  // and the next call to current_frame fails
  return ++index_ < images_.size();
}

b. current_frame: return a resource or a view to the current position (i.e., get the frame)

return_type? current_frame()
{
  return images_[index_]; // if index_ is allowed to be out of range,
                          // then check for that and return NULL
}

c. read_frame (and variants such as read_frame_copy()): advance and get the frame

return_type? read_frame()
{
  advance();
  return current_frame();
}

2. Regarding initial state:

If index_ isn't allowed to be out of range (this assumes images_.size() is at least one), then initialize index_ = 0 and expect the user to do something like:

do {
  image = istream.current_frame();
  ...
} while (istream.advance());

Another option is to allow index_ to be out of range and initialize it to -1 (no more unsigned int) and expect the user to do something like:

while (istream.advance()) {
  image = istream.current_frame();
  ...
}

I don't have a preference, but I suspect that the second one is what most people want or assume.

--Miguel
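P.S. To make the second option concrete, here is a toy, self-contained version of the idea (plain C++, with ints standing in for frames and a signed index so -1 is legal, as above; all names are illustrative, not the real vidl2 classes):

#include <iostream>
#include <vector>

class toy_istream
{
  std::vector<int> images_;  // stand-in for the frame list
  int index_;                // -1 means "before the first frame"
public:
  toy_istream(const std::vector<int>& imgs) : images_(imgs), index_(-1) {}

  bool advance() { return ++index_ < int(images_.size()); }
  bool is_valid() const { return 0 <= index_ && index_ < int(images_.size()); }
  int current_frame() const { return is_valid() ? images_[index_] : -1; }
};

int main()
{
  std::vector<int> frames;
  frames.push_back(10); frames.push_back(20); frames.push_back(30);

  toy_istream istream(frames);
  while (istream.advance())  // visits frames 0, 1, 2 - nothing skipped
    std::cout << istream.current_frame() << '\n';
  return 0;
}
|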
From: Amitha P. <pe...@cs...> - 2006-01-08 18:15:12
|
On Fri 06 Jan 2006, Miguel A. Figueroa-Villanueva wrote:
> Another option is to allow index_ to be out of range and initialize it
> to -1 (no more unsigned int) [...]

For the record, initializing the counter to unsigned(-1) is well-defined, and works as expected: unsigned(-1) + unsigned(1) == unsigned(0) is guaranteed by the standard.

Of course, since unsigned follows modulo arithmetic, unsigned(-1) > unsigned(0), which implies more care in dealing with the value.

I personally prefer unsigned values, but others have strong opposing opinions.

Amitha.
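P.S. A four-line demonstration of the guarantee, in case it helps (standalone, nothing vidl2-specific):

#include <cassert>

int main()
{
  unsigned i = unsigned(-1);  // i == MAX_UINT, the sentinel value
  ++i;                        // modulo arithmetic: wraps around to 0
  assert(i == 0u);            // guaranteed by the C++ standard
  assert(unsigned(-1) > 0u);  // but note: the sentinel compares *large*
  return 0;
}
|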
From: Miguel A. Figueroa-V. <mi...@ms...> - 2006-01-08 20:33:09
|
Amitha Perera wrote:
> > Another option is to allow index_ to be out of range and initialize it
> > to -1 (no more unsigned int) [...]
>
> For the record, initializing the counter to unsigned(-1) is
> well-defined, and works as expected: unsigned(-1) + unsigned(1) ==
> unsigned(0) is guaranteed by the standard.
>
> Of course, since unsigned follows modulo arithmetic,
> unsigned(-1) > unsigned(0), which implies more care in dealing with
> the value.

That's a good point; for some reason I didn't think of it that way... But as stated above, is_valid then gets trickier, because as it is it will return true for this initial frame. I guess you can do one of the following:

1. Use an assert in current_frame to detect the -1. Note that if we have reached this index_ legally, then it is time to use a larger data type like unsigned long.

vil_image_resource_sptr vidl2_image_list_istream::current_frame()
{
  assert(static_cast<int>(index_) != -1); // only checked in debug mode,
                                          // so no performance hit
  if (is_valid())
    return images_[index_];
  else
    return NULL;
}

2. Or check it in is_valid:

virtual bool is_valid() const
{
  return is_open()
      && index_ < images_.size()
      && static_cast<int>(index_) != -1; // most of the time this is
                                         // short-circuited and won't
                                         // incur a performance hit, right?
}

> I personally prefer unsigned values, but others have strong opposing
> opinions.

I guess that if all my assumptions above are correct, then I also prefer the unsigned values. In this case, we are only sacrificing the largest available integer to use as a flag.

Which brings along my other question... On some platforms an int might be 16 bits (is this still true in practice? I think on most platforms it is 32 bits).

2^16 / 30 frames/sec / 60 sec/min ~= 36 min

That means that on those platforms we support only videos of under 37 min. So the question arises: shouldn't we use unsigned long instead?

--Miguel
|
From: Peter V. <pet...@ya...> - 2006-01-09 08:30:19
|
> So the question arises: shouldn't we use unsigned long instead?

Or maybe rather vxl_uint_32, since "unsigned long" might be 64-bit on some platforms (or am I mistaken here?)

-- Peter.
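P.S. If we go that way, the fixed-width typedefs come from vxl_config.h, if I remember correctly, so at least the assumption is cheap to check (tiny sketch):

#include <vxl_config.h>  // defines vxl_uint_32 and friends
#include <cassert>

int main()
{
  // vxl_uint_32 is exactly 32 bits on every supported platform,
  // whereas sizeof(unsigned long) may be 4 or 8 bytes.
  assert(sizeof(vxl_uint_32) == 4);
  assert(sizeof(unsigned long) >= 4);
  return 0;
}
|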
From: Ian S. <ian...@st...> - 2006-01-09 09:21:22
|
Miguel A. Figueroa-Villanueva wrote:
> Amitha Perera wrote:

I prefer unsigned in such APIs - the fact that the library uses -1 internally is an implementation detail and doesn't need to be exposed. That the API uses unsigned tells me at a glance some useful properties about that API.

> Which brings along my other question... On some platforms an int might
> be 16 bits (is this still true in practice? I think on most platforms
> it is 32 bits).
>
> 2^16 / 30 frames/sec / 60 sec/min ~= 36 min
>
> That means that on those platforms we support only videos of under 37
> min. So the question arises: shouldn't we use unsigned long instead?

and Peter Vanroose wrote:
> Or maybe rather vxl_uint_32, since "unsigned long" might be 64-bit on
> some platforms (or am I mistaken here?)

There are two standard answers to this question.

Typically, VXL just picks unsigned long or unsigned int for a particular application. The fact that the type has different sizes on different platforms should be irrelevant in most cases. If you want to check that you aren't about to overflow the type, that is fine, but I don't see any need for the length of supported movies to be platform-invariant. If someone wants to do serious computer vision on a 16-bit platform then they will be aware of many limitations (no 66000-element-long vectors, etc.). This is the C (and C++) way. The only time it is appropriate to use fixed-length types is when we are constrained by some external standard - e.g., when loading a 32-bit image, the image pixel type needs to be a fixed 32-bit length for anything to work.

The alternative, STL way is to use vcl_size_t. Note that this is still a platform-defined length, but most counting types (e.g., container lengths, etc.) will be this length.

I'd personally go with size_t - and then, if I had the time, I'd like to replace vnl_vector's size() type (and all the rest) with vcl_size_t.

Ian.
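P.S. To show the two conventions side by side (illustrative only - I'm assuming vcl_cstddef.h for vcl_size_t, and the range check is the sort of thing I'd do instead of fixing the width):

#include <vcl_cstddef.h>  // vcl_size_t, VXL's wrapper for std::size_t
#include <cassert>

// The C/C++ way: a natural-width type plus an explicit overflow check.
unsigned next_frame_index(unsigned index)
{
  assert(index + 1 != 0u);  // unsigned wrap-around would mean overflow
  return index + 1;
}

// The STL way: counting types match the containers' size types.
vcl_size_t clamp_buffer_depth(vcl_size_t requested, vcl_size_t available)
{
  return requested < available ? requested : available;
}
|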
From: Amitha P. <pe...@cs...> - 2006-01-09 13:59:24
|
On Sun 08 Jan 2006, Miguel A. Figueroa-Villanueva wrote:
> vil_image_resource_sptr vidl2_image_list_istream::current_frame()
> {
>   assert(static_cast<int>(index_) != -1); // only checked in debug mode,
>                                           // so no performance hit
[...]
> virtual bool is_valid() const
> {
>   return is_open()
>       && index_ < images_.size()
>       && static_cast<int>(index_) != -1; // most of the time this is
>                                          // short-circuited and won't
>                                          // incur a performance hit, right?

In neither case is the cast required. Since index_ is unsigned, and unsigned(-1) == MAX_UINT, the checks can be conveniently written as

  assert( index_ < images_.size() );

and

  return is_open() && index_ < images_.size();

The inequality checks both for one-past-the-end and for uninitialized. Instead of thinking of -1, maybe it is useful to think of the sentinel value as MAX_UINT, keeping in mind that C++ guarantees that MAX_UINT + 1 == 0.

> That means that on those platforms we support only videos of under 37
> min. So the question arises: shouldn't we use unsigned long instead?

I think that's reasonable. Video frame numbers can get very large. (Even though, IIRC, the standard does not guarantee that long is 32-bit; it merely guarantees that long is at least as wide as int.)

Amitha.
|
From: Amitha P. <pe...@cs...> - 2006-01-08 18:16:37
|
On Fri 06 Jan 2006, Miguel A. Figueroa-Villanueva wrote:
> 2. Regarding initial state:
>
> If index_ isn't allowed to be out of range (this assumes
> images_.size() is at least one), then initialize index_ = 0 and expect
> the user to do something like:
>
> do {
>   image = istream.current_frame();
>   ...
> } while (istream.advance());
>
> Another option is to allow index_ to be out of range and initialize it
> to -1 (no more unsigned int) and expect the user to do something like:
>
> while (istream.advance()) {
>   image = istream.current_frame();
>   ...
> }
>
> I don't have a preference, but I suspect that the second one is what
> most people want or assume.

I prefer the second.

Amitha.
|
From: Miguel A. Figueroa-V. <mi...@ms...> - 2006-01-06 12:21:11
|
Brendan McCane wrote:
>> I'm showing my ignorance of framegrabbers now: do framegrabbers
>> generally have multiple image storage that we can wrap individually?
>
> Usually they have at least 2, I think (one to acquire to and one to read
> from - even USB cams often seem to have this capability, or at least
> the Linux drivers for USB cams seem to offer it).

I am only familiar with an old Matrox Meteor-II frame grabber, which has 4MB of on-board memory. I did a quick search and it seems that these days many have 64MB and over (especially the ones targeting multi-camera setups). Now, on the Meteor-II I can allocate several buffers with different attributes... if the attributes include MIL_GRAB, then the buffer is allocated in "physically contiguous and always present memory (that is, non-paged memory)", whatever that means... I think it is DMA, right? The thing that I am certain about is that I can allocate more than one buffer; the buffers have different MIL_IDs, and using these identifiers I can grab to the different buffers. With this I can implement a multi-buffer pipeline.

It gets a little more complicated: the Meteor-II board also supports multi-channel input, meaning that I can have several cameras hooked to the same frame grabber and sequentially switch channels, capturing to a different buffer each time (or they can all grab to the same one). I have also seen specs on the web (i.e., no experience with them) for frame grabbers that allow simultaneous input from multiple cameras (on the Meteor-II you have to switch channels, so there is a switch delay of about 30ms). Others also support on-board compression (e.g., 4 JPEG encoders on-board) to reduce traffic on the PCI bus, on-board vision processing, etc.

Now, I understand that we don't want to support every permutation of use cases in the vidl2_istream interface. Maybe that would be the task of, say, a frame_grabber object. The problem is that I can only speculate how people, particularly VXL'ers, use frame grabbers, because I only know how I use them... In my case, I want to capture streaming images from several cameras directly to disk for offline processing. But I think that multi-buffering, particularly double-buffering, is very common practice for making efficient use of resources. I also believe in a clean and simple API. So, the above information is meant to give some idea of how frame grabbers work, to "speculate" how people are likely to use them, and to help determine how far we want to support them in the vidl2 library.

>> However, even if this is the case, I'd argue that the logic to
>> deal with this generally should lie in the buffered_image_process.
>> The user specifies the length of the buffer he wants, and
>> the process does it as efficiently as possible, perhaps by querying
>> the input source about its internal buffer characteristics. The user
>> then does not need to worry about the specifics of the video source.
>
> I agree. I think having a buffered_image_process is a good idea.
> However, the usual method of asynchronous capture involves starting the
> next acquire before reading from the latest image, so I would advocate
> that the simple framegrabber use at least 2 frames if it can - this is
> really just double buffering.

I agree here on both counts. The buffered_image_process idea seems good as long as it can talk to the vidl2_istream to set things up efficiently (from the info below, it may be that the efficient thing is to do the deep_copy after all). However, I also agree that at least double buffering should be supported without the need to bring the buffered_image_process into play.

>> - the cost savings, in general, would be the memcpy of, say, 1MB of
>>   data. Is that so expensive? (The buffered process can preallocate
>>   its image buffers.)
>
> Good question. I just ran a quick test: on a P4 2.8GHz machine, it takes
> approx. 0.0009 sec to memcpy 1MB of memory. So if you want to process at
> 30 frames/sec, the memcpy will take up approx. 3% of the available time.

While browsing through the multi-buffering section of the MIL-Lite manual, I came across the following scenario (as if we didn't have enough discussion going on)... what about the case when an occasional frame takes more time to process than the time required for a capture? The manual says the way to solve this is to hook the grab function (that would be advance_start() for us) to the GRAB_END event, for instance. This way the processing is interrupted to start the grab and then resumes, enabling the processing to catch up. Obviously, on average your processing needs to be faster than your capturing, but if your processing peaks once in a while, you don't need to drop any frames. The problem here is that in our interface we are polling is_frame_available() instead of using callbacks... any ideas on how to support this cleanly?

--Miguel
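P.S. One possibility (just a sketch - none of these names exist in vidl2) is to let a stream accept an optional observer that it invokes from the driver's grab-end hook, with the polling interface staying as the default:

//: Hypothetical observer for grab-end events.
class vidl2_capture_observer
{
public:
  virtual ~vidl2_capture_observer() {}
  //: Called by the stream when a newly grabbed frame is ready.
  virtual void frame_ready() = 0;
};

class some_live_istream
{
  vidl2_capture_observer* observer_;  // null => the caller polls instead
public:
  some_live_istream() : observer_(0) {}
  void set_observer(vidl2_capture_observer* obs) { observer_ = obs; }

protected:
  //: A driver callback (MIL's GRAB_END, say) would land here; if no
  //  observer is registered, is_frame_available() keeps working as before.
  void on_grab_end() { if (observer_) observer_->frame_ready(); }
};
|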
From: Matt L. <mat...@gm...> - 2006-01-05 15:53:26
|
Wow, this discussion has really picked up now. Here are a few more of my thoughts.

On 1/4/06, Amitha Perera <pe...@cs...> wrote:
> 1. Parameter blocks.
>
> First, a class should not inherit from its parameter
> block. Inheritance should be an is-a relationship, and that does not
> apply. The cost is simply a five-character prefix:
>
> struct object_params {
>   int p1, p2;
> };
>
> class object {
>   object_params prms_;
> public:
>   object( object_params const& p ) : prms_(p) { }
>
>   int foo() { return prms_.p1 + prms_.p2; }
>   // instead of
>   // int foo() { return p1 + p2; }
> };
>
> Parameter blocks should have (clearly documented) default values, and
> set methods to change them.
>
> struct video_out_params {
>   //: The width of the image (default 320).
>   int width_;
>
>   //: The height of the image (default 240).
>   int height_;
>
>   video_out_params()
>     : width_( 320 ),
>       height_( 240 )
>   { }
>
>   //: \sa width
>   video_out_params& set_width( int v ) { width_ = v; return *this; }
>
>   //: \sa height
>   video_out_params& set_height( int v ) { height_ = v; return *this; }
> };

I'm using this exact model for the ffmpeg_ostream_params that I am working on now. I think this is the way to go. Of course, we should also support reading and writing these parameter blocks to disk. It would be nice to use XML for this, but there is currently no standard way of supporting XML parsing in VXL. Maybe we can use Expat or something.

> 2. Factory classes
>
> The way to get a run-time configurable input or output stream would
> be to use a factory class _for_the_whole_stream_.
>
> vidl2_istream_sptr stream = vidl2_create_stream_from_xml( "config.xml" );
>
> The factory would then understand the available streams, and delegate
> to other classes to actually create the appropriate stream objects.

This is a good idea if we have some standardized file format for all types of stream configuration files (such as XML). I see no reason not to support this sort of run-time configuration as long as the compile-time configuration is also available. In other words, I don't think the factory should be the only way to create a stream. Also, I agree with Miguel that probing all available stream types is probably not a good way to automatically determine the stream type from a pointer to a file or device.

> 3. Memory allocation
>
> I think the default behaviour of read_frame should be to give access
> to an image that is valid until the next call to read_frame. This
> allows a framegrabber to create an image_view that simply wraps the
> framegrabber's memory, if that is possible. Providing a pre-allocated
> memory segment does not allow for this.
>
> If the user wishes to keep the image around for longer, they would
> have to call deep_copy on that. If the efficiency of that becomes a
> question, then a method read_frame_copy() could generate an image_view
> that would not be subsequently modified by the video reader, and the
> video reader could implement this as efficiently as possible.
>
> I think that video users should be aware of the efficiencies
> associated with memory allocation, and should consciously choose to
> do a deep copy.

Maybe read_frame_copy() will solve my problem. I'm actually thinking more about streams that don't capture live images using special memory. In the current API, read_frame() returns a vil_image_resource_sptr, not a vil_image_view. This has the advantage that the image may not actually be in memory yet (it could be a file on disk, a frame in an encoded video on disk, etc.). So I can hang onto these resources without filling all of my memory. I want to be able to hang onto some of these resources without worrying that they will become invalid at some point. And I don't want to do a deep_copy, because that would load the image into memory. Maybe the other read_frame() function should always return a vil_image_view or a vil_memory_image, with the stipulation that the data is only valid for a limited time.

I do think that the stream should support multiple buffers (especially if the capture hardware supports it). I don't have much experience with this sort of hardware, so I really can't suggest a good way of dealing with this.

> 4. Asynchronous capture
>
> In general, I prefer two methods for getting images: advance() and
> current_frame(). The simple loop would then be something like
>
> while( stream.advance() ) {
>   process( stream.current_frame() );
> }
>
> For async capture, advance() could be divided into advance_start()
> (non-blocking), advance_wait() (blocking), and is_frame_available()
> (non-blocking).

This is fine with me. Would all streams have these async functions, or would we want a separate subclass for streams that support this? If all streams had these functions, how should streams without async support handle them?

--Matt
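P.S. To make the async split concrete, the capture loop would look something like this (sketch only - a stand-in interface, since the real vidl2_istream doesn't have these members yet):

//: Stand-in for an istream with the proposed async methods.
struct istream_with_async
{
  virtual ~istream_with_async() {}
  virtual bool advance_start() = 0;       // non-blocking: begin next grab
  virtual void advance_wait() = 0;        // block until the grab completes
  virtual bool is_frame_available() = 0;  // non-blocking readiness probe
  virtual int  current_frame() = 0;       // int stands in for the image
};

void process(int /*frame*/) {}
void do_other_work() {}

void capture_loop(istream_with_async& stream)
{
  while (stream.advance_start())
  {
    do_other_work();                   // overlap processing with capture
    if (!stream.is_frame_available())
      stream.advance_wait();
    process(stream.current_frame());
  }
}

// A synchronous-only stream could satisfy the same interface by making
// advance_start() do a whole blocking advance() and having
// is_frame_available() always return true.
|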
From: Miguel A. Figueroa-V. <mi...@ms...> - 2006-01-06 12:57:07
|
Matt Leotta wrote:
>> 1. Parameter blocks.
>
> I'm using this exact model for the ffmpeg_ostream_params that I am
> working on now. I think this is the way to go. Of course, we should
> also support reading and writing these parameter blocks to disk. It
> would be nice to use XML for this, but there is currently no standard
> way of supporting XML parsing in VXL. Maybe we can use Expat or
> something.

If we can work with a map<string,string> to store the key/value pairs (this implies that there is no complex nesting and so on), then the parser and the format of the configuration file can be decoupled. Of course, we still need to process the strings to load them into the corresponding entries in the parameter block structure, but then you can use mul's <mbl/mbl_read_props.h> to parse the configuration file and return a map<string,string> with the values. You can then focus on the map<string,string>-to-params end of the equation...

>> 4. Asynchronous capture
>>
>> In general, I prefer two methods for getting images: advance() and
>> current_frame(). The simple loop would then be something like
>>
>> while( stream.advance() ) {
>>   process( stream.current_frame() );
>> }
>>
>> For async capture, advance() could be divided into advance_start()
>> (non-blocking), advance_wait() (blocking), and is_frame_available()
>> (non-blocking).
>
> This is fine with me. Would all streams have these async functions, or
> would we want a separate subclass for streams that support this? If
> all streams had these functions, how should streams without async
> support handle them?

Do we need both advance() and advance_wait()? They both do the same thing, right? I guess that for synch-only streams you can either have advance_start() call advance() and is_frame_available() always return true, or have both of them throw an exception.

--Miguel
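P.S. Roughly what I have in mind for the map-to-params end (a plain std::map here; in VXL it would be whatever mbl_read_props hands back, and the ffmpeg_ostream_params fields shown are made up for illustration):

#include <cstdlib>
#include <map>
#include <string>

struct ffmpeg_ostream_params  // illustrative subset only
{
  int width_, height_;
  ffmpeg_ostream_params() : width_(320), height_(240) {}
};

//: Fill a parameter block from parsed "key: value" properties.
//  Unknown keys are ignored; missing keys keep their defaults.
ffmpeg_ostream_params
params_from_props(const std::map<std::string, std::string>& props)
{
  ffmpeg_ostream_params p;
  std::map<std::string, std::string>::const_iterator it;
  if ((it = props.find("width")) != props.end())
    p.width_ = std::atoi(it->second.c_str());
  if ((it = props.find("height")) != props.end())
    p.height_ = std::atoi(it->second.c_str());
  return p;
}
|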