Matt Leotta wrote:
> work primarily in Linux. I'll try to address your concerns to the
> best of my ability and hopefully we can work out a common interface.
I'm not much of an expert either... so, I guess it would be nice to have
the feedback of more experienced members in this area.
> I agree. Some video feeds will have many parameters while others will
> have very few. My hope was that we could always specify the
> parameters during stream construction (or in some initialization
> routine) and then simply request frames. This way we would have a
> common interface to cameras and encoded video files.
I agree on this. I think all configuration should be done at
construction, and then you just request video frames or work with the
stream as needed. This would help keep a clean interface for the stream
handling, without the clutter of many functions to adjust every possible
parameter.
>>The way I am thinking of it is to have live feeds (e.g.,
>>vidl2_cmu1394_istream, vidl2_v4l_istream, vidl2_directshow_istream)
>>inherit from vidl2_istream and a parameters class hierarchy that
>>inherits from vidl2_istream_params, which would encapsulate the
>>parameters used for creation.
> Absolutely. A parameters structure would be very useful for camera
> streams when there are many parameters to work with. However, I'm not
> so sure about the reason for a parameters class hierarchy. Is there
> really a well defined hierarchy for parameters classes? What is the
> advantage over simply defining a parameters class for each stream that
> needs one?
I'm really not sure... there are parameters that might overlap (e.g.,
size_x, size_y, multibuffering, multichannel, etc.), but I don't think
this would be reason enough to have a hierarchy of classes for params.
I think it boils down to how you want to configure the params class and
create the stream.
As Dr. McCane mentioned we would like to support stream creation as in
the following for simple configurations:
// a live feed, e.g. the linux/v4l way (hypothetical constructor):
vidl2_v4l_istream live("/dev/video0");
// for video files (hypothetical file-based istream):
vidl2_file_istream file("movie.avi");
But for more complex configurations you should have a way of configuring
the stream. The easiest approach in my opinion is to have a
configuration file parsed to load the options into the params class and
let the params create the stream using a factory method. In this case if
there is a params hierarchy, then I don't need to know the type to
instantiate at compilation time (create_params returns a
vidl_istream_params instead of vidl_istream_dshow_params):
// parse a config file and build the right params subclass:
vidl_istream_params *params = create_params(config_file);
vidl_istream *input = params->create_stream();
I can also see code like this coming in handy:
// create a params object (type not known at compile time)
vcl_string type = "DirectShow";
vidl_istream_params *params = create_params(type);
// then configure some settings, possibly through GUI menu items
// then create the stream as:
vidl_istream *input = params->create_stream();
These are ideas on how I think things could work, but I don't know what
the standard way of doing things is... So, I guess one of the first
tasks is to decide on what type of interface for stream creation we want
to support. Whether params will be a hierarchy of classes will be
determined by that. I guess it might also be possible to design a
templated solution instead of a class hierarchy.
>>2. My other concern is asynch vs synch acquisition. I think this is
>>crucial for video capture applications in which for example multiple
>>frame grabber boards are used and you would want to start acquiring from
>>both "pseudo-"simultaneously and not wait until one has finished to be
>>able to start the next. I think this should be part of the vidl2_istream
>>interface in a similar way that seekable streams are handled. In this
>>case, functions such as is_asynch_capable(), read_frame_asynch(),
>>etc. would be needed.
> That's a good point, but maybe we can handle this with a more general
> interface. If we had a is_image_ready() function in the base
> vidl2_istream class it could act as is_capture_finished() but for all
> input streams. This function would always return true for other types
> of input streams. Then you could treat all input streams as asynch
> capable and you also wouldn't need a separate read_frame_asynch()
> function. Would this handle what you are talking about? I don't have
> much experience in this area so let me know if I am missing the point.
> Of course, the other issue is synchronized collection. I haven't
> looked into that much.
Well, I think it is nice to have separate member functions for reading
synch/asynch... this makes things easier for the user, who could get
confused by calling input.read_frame() and then working with the frame
without knowing that the capture isn't finished.
I think is_image_ready() works fine. And the safest thing would be to
let read_frame() default to synchronous capture and have a
read_frame_asynch() for asynch (it can of course do the same as
read_frame() if the device doesn't support asynch or throw an
exception). Also, note that synch capture can always be implemented as
an asynch capture that waits for the image to be ready:
vil_image_resource_sptr temp = read_frame_asynch();
while ( !is_image_ready() ); // busy-wait until the frame arrives
The code for multiple asynch captures would look something like:
input1.read_frame_asynch();
input2.read_frame_asynch();
input3.read_frame_asynch();
// wait until *all* captures have finished
while ( !input1.is_image_ready()
     || !input2.is_image_ready()
     || !input3.is_image_ready() );
// process the frames
>>3. Another topic, which I'm definitely not familiar with (but need to do
>>the research on) is the writing to disk. In many live video processing
>>applications you might not need to write to disk, while in others your
>>sole need is to capture the stream to disk for offline processing. In
>>this case I am assuming that the frame grabbers have mechanisms to
>>directly transfer the acquired data from its memory to the hard drive
>>with minimum utilization of the CPU. In this case, there is a need to be
>>able to connect a vidl2_istream with a vidl2_ostream with as little
>>overhead as possible, specifically without the extra image copying.
> I've also been thinking about this. Connecting a vidl2_istream to a
> vidl2_ostream should be as simple as passing each
> vil_image_resource_sptr from vidl2_istream.read_frame() to
> vidl2_ostream.write_frame(). The smart pointer will prevent extra
> copying of the image data. However, a bigger concern might be memory
> allocation. The istream will need to allocate new memory for each
> frame. The user will typically not want to overwrite the data from
> the previous frame. Yet, in some cases this is exactly what we want
> because it is faster and we no longer need the previous frame after it
> is used.
> I am thinking of adding a function that allows the user to optionally
> supply the memory to be used.
In the case of video capture (and I think this applies to file input as
well) you don't want to be allocating things per frame. I think you
should have a set of buffers allocated (vector<vil_image_resource_sptr>)
and cycle through it. For example in the case of double buffering you
can use two buffers, while you process one you capture in the other.
This can be handled in the vidl_istream class hidden from the user. The
user can always do a deep copy of his smart pointer handle if he wishes
not to let that frame be destroyed in the next acquisition.
What concerns me is that the buffers that vidl_istream allocates will be
in the heap, while I think we can make more efficient use of frame
grabbers through their API (MIL-Lite for Matrox, for example) to collect
to the memory in the frame grabber and copy from there directly to the
hard drive. In this case I'm not sure we will be able to wrap this in a
generic istream/ostream interface.
For example for MIL/Matrox the API is as follows:
// over simplified sequence to give an idea...
MdigGrab(dig_id, buff_id); // grabs the frame to device
MbufGet(buff_id, (void *) image); // copy the buffer from device
// to user supplied memory
MbufExport("file.bmp", M_BMP, buff_id); // save buffer to disk
dig_id and buff_id are integer identifiers that are used internally to
identify previously allocated resources.
The thing to note here is that if I want to pass the image from device
to disk directly I don't need the MbufGet function call. So, a
vidl_istream::read_frame() would usually wrap the MdigGrab and MbufGet
calls and I would use the vidl_ostream::write_frame() to save the frame.
But this is inefficient, because I could have done it without the extra
image copy (MbufGet), which is expensive... I think this is true for any
device that can do DMA transfers (that is, you don't have a handle on
the frame unless you copy it into vidl_istream-supplied memory, as
MbufGet does), but I'm not sure.
Did I explain my concerns correctly? Any ideas on how to handle such
issues in an efficient way?
> Any contribution you can make will be greatly appreciated. I know
> several people who have been looking for a way to read and write video
> files using DirectShow in VXL.
Well, I am putting some thoughts together, as you can see from this
email, and I will soon be submitting some code. I guess we need to
refine the interface, but things are looking good. Thanks for taking
this project up. IMHO, frame grabber/camera capture is really missing
from core VXL.