From: David S. <ds...@en...> - 2009-03-27 06:19:20
On Thu, Mar 26, 2009 at 11:07:32PM +0100, Norbert Kubiec wrote:
> The question of latency was not unfounded. Have you heard about OnLive?
> They use a new interactive video compression algorithm. Latency through
> the algorithm is just 1 ms, instead of the 0.5- to 0.75-second lag
> inherent in conventional compression algorithms used in corporate video
> conferencing solutions, for example.

Glad to hear that you totally bought the marketing speak. :)

Rather than respond to your questions directly, I'll talk generally about how low-latency video codecs work.

One key point about low-latency video encoding is that the output bits that represent a pixel have to exist somewhere in the bitstream between the time the encoder gets that pixel from the camera and N ms later, where N is the latency.

One method of very low-latency compression works on a scanline basis; an example is the low-delay profile of Dirac. A camera reads out a few scan lines (say, 16), the encoder compresses them, and then sends those bits out over ethernet or ASI or whatever. The latency is on the order of a few scan lines -- say, 16*2 plus a small number. Why 16*2? Because it takes 16 lines' worth of time to read in the 16-line chunk, and then the encoder spends the time it takes to read in the next chunk encoding the first chunk and sending it out over the wire. Simultaneously, the decoder reads in the data and decodes it. Then, during the third set of 16 lines, the decoder scans out the uncompressed lines. So the decoder scans out line 0 as the camera is scanning out line 32. Real encoders need a bit of extra time for synchronization, so 32 lines is the ideal case. Of course, in a real system there is also network latency, but we'll make someone else worry about that.

32 lines works out to be about 1 ms for 1080p at 30 frames per second, depending on exactly the system you're using. Compression ratios are purposely low, since you can't spread worst-case bits around at all, and because this kind of compression is only really useful for studio work.
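To put a rough number on that "32 lines is about 1 ms" claim, here's a back-of-the-envelope sketch. It assumes the SMPTE 1125-total-line raster for 1080p (active lines plus blanking); the exact figure depends on your system's line timing, as noted above.

```python
# Back-of-the-envelope latency for scanline-chunk (Dirac low-delay
# style) encoding at 1080p, 30 frames per second.
# Assumption: 1125 total lines per frame (SMPTE 1080-line raster,
# including blanking); real systems vary slightly.

FPS = 30
TOTAL_LINES = 1125          # 1080 active lines + blanking
CHUNK = 16                  # scan lines per encoded chunk

frame_period_ms = 1000.0 / FPS                 # ~33.3 ms per frame
line_time_ms = frame_period_ms / TOTAL_LINES   # ~0.03 ms per line

# One chunk-time to read the lines in, one chunk-time to encode,
# transmit, and decode them -- the 16*2 from the text:
latency_ms = 2 * CHUNK * line_time_ms

print(f"per-line time: {line_time_ms * 1000:.1f} us")
print(f"ideal latency: {latency_ms:.2f} ms")   # just under 1 ms
```

So the "about 1 ms" figure is the two-chunk pipeline delay; synchronization and network time sit on top of that.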
Note that cameras with a few-scanline latency start at USD 10,000, and an encoder/decoder pair for Dirac Pro is about USD 4,000, IIRC. This is not the kind of technology you roll out in a consumer product.

Another method is similar, but uses an entire frame instead of a few scan lines. In this case, you get a theoretical latency of 2 frames, or about 67 ms for 30 fps video. I've seen companies advertising encoder/decoder pairs that claim 70 ms latency (of course, without any network latency), and I can pretty much believe this number. Again, you can't get away with cheap hardware -- my DV camera has an internal latency somewhere between 90 and 120 ms, and HDV cameras are much worse.

In a frame-based low-latency system, it's much more realistic to use motion compensation, in which you use the previous one or two frames as reference pictures. Since the general point of using motion compensation is to decrease the bit rate, this causes compression artifacts immediately after scene changes that clear up over a few frames -- an effect that is very characteristic of the technique.

Due to the way that Dirac puts pictures together, the non-low-delay profiles of Dirac have an approximate latency of 4 pictures for a simple implementation, although you can decrease this to nearly 2 pictures with more complex algorithms. Schroedinger implements the simple algorithm, and with suitable modifications (it does not do this by default) you can get close to 4 frames of latency. Schro's implementation of the Low-Delay profile is also 4 frames, since it uses the same code. Entropy Wave has implementations of the more complex algorithm for the Simple and Intra profiles, as well as an actual low-delay implementation of the Low-Delay profile, with latencies that are very near the theoretical latencies. These are not open source. Unfortunately, since all the code that can currently use these codecs is frame based, there's very little advantage over Schroedinger unless you write a bunch of custom code.

dave...
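The frame-based numbers above work the same way: the theoretical floor is two frame periods (one to capture the frame, one to encode, transmit, and decode it), and a consumer camera's internal latency stacks on top. A sketch, using the 70 ms advertised codec figure and a mid-range value for the 90-120 ms DV-camera latency mentioned above:

```python
# Frame-based low-latency pipeline budget at 30 fps.
# The 100 ms camera figure is an illustrative midpoint of the
# 90-120 ms DV-camera range; network latency is excluded.

FPS = 30
frame_period_ms = 1000.0 / FPS

codec_floor_ms = 2 * frame_period_ms      # ~66.7 ms theoretical floor
print(f"theoretical codec latency: {codec_floor_ms:.1f} ms")

# Advertised encoder/decoder pairs claiming ~70 ms are plausible:
# only a few ms above the theoretical floor.
advertised_ms = 70.0
print(f"overhead vs. floor: {advertised_ms - codec_floor_ms:.1f} ms")

# A cheap camera's internal latency dominates the budget:
camera_ms = 100.0
print(f"end-to-end with cheap camera: {advertised_ms + camera_ms:.0f} ms")
```

Which is the point about hardware: even a near-ideal codec can't save a pipeline fed by a camera that is slower than the codec itself.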