Re: [libdv-dev] Code size

Which is faster: unrolling the loops and growing past 12K
or leaving the loops in and keeping it under?

Switching to a block-based ycrcb_to_rgb gave me about 5%
speed improvement over full-frame conversion. This
and other changes are with Erik for review.
(Got rid of place.c, broke PAL decoding, repackaged
closer to library form, added dv2ppm.c, ...)
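
For anyone who hasn't seen the review copy, here is a rough sketch of
what I mean by block-based conversion. It is not the code that is with
Erik. The name block_ycbcr_to_rgb, the packed 24-bit RGB output, the
full-resolution chroma and the video-range BT.601 integer coefficients
are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate value to the 0..255 range of an 8-bit channel. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/*
 * Hypothetical helper (not the libdv routine): convert one 8x8 block of
 * Y, Cb, Cr samples to packed RGB while the block is still hot in cache.
 * Assumes chroma has already been upsampled to one sample per pixel.
 * rgb points at the block's top-left pixel; stride is bytes per output row.
 */
static void block_ycbcr_to_rgb(const uint8_t y[64], const uint8_t cb[64],
                               const uint8_t cr[64], uint8_t *rgb, int stride)
{
    assert(rgb != NULL && stride >= 8 * 3);

    for (int row = 0; row < 8; row++) {
        uint8_t *out = rgb + (size_t)row * (size_t)stride;
        for (int col = 0; col < 8; col++) {
            int i = row * 8 + col;
            int c = 298 * (y[i] - 16);   /* luma, scaled by 256 */
            int d = cb[i] - 128;         /* blue-difference chroma */
            int e = cr[i] - 128;         /* red-difference chroma */
            /* Divide instead of shifting so negative values stay portable. */
            *out++ = clamp_u8((c + 409 * e + 128) / 256);            /* R */
            *out++ = clamp_u8((c - 100 * d - 208 * e + 128) / 256);  /* G */
            *out++ = clamp_u8((c + 516 * d + 128) / 256);            /* B */
        }
    }
}
```

The idea is to call something like this once per decoded block, instead
of walking the whole frame again after decoding has finished.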

I propose we stay away from kernel hacks for as long
as possible. Ideally we should keep maintaining C versions
of each routine, to assist in cross-platform
development. I'm sure the LinuxPPC folks will want this
code, and if we fill it with too much ia32-specific
code, they may revolt!

We may also want to offer speed vs. quality options for
our users. One option is to skip the third-pass AC
decoding and just return from dv_parse_video_segment()
without calling dv_parse_ac_coeffs(seg). In playback,
the additional error is barely noticeable. (I had to use
dv2ppm to grab frames and compare the results.)
For some uses, like DV editing, where speed is more
important than quality, I'd even be willing to forgo
*ALL* the AC decoding. Just give me 8x8 blocks of DC,
which my tests show runs more than 3x faster. Those
ducks look awfully blocky, though!
There may be other intermediate "exit points" in the decoder that
we'll want to maintain as options. (Y_ONLY is another
example: great for video editing when detail is needed,
but color isn't.)
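
To make the exit-point idea concrete, here is a minimal sketch of how a
quality setting could pick where the parser stops. Everything in it is
hypothetical: quality_t, segment_sketch_t and the stub_* functions are
stand-ins I made up for illustration, not the real libdv structures or
the real dv_parse_* routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical quality levels, ordered from fastest to best. */
typedef enum {
    QUALITY_DC_ONLY,       /* stop after DC: blocky, but fastest */
    QUALITY_SKIP_LAST_AC,  /* skip the final AC pass */
    QUALITY_FULL           /* run every pass */
} quality_t;

/* Stand-in for a video segment; it only records which stages ran. */
typedef struct {
    int dc_done;
    int early_ac_done;
    int final_ac_done;
} segment_sketch_t;

/* Stubs standing in for the real parsing work. */
static void stub_parse_dc(segment_sketch_t *s)       { s->dc_done = 1; }
static void stub_parse_early_ac(segment_sketch_t *s) { s->early_ac_done = 1; }
static void stub_parse_final_ac(segment_sketch_t *s) { s->final_ac_done = 1; }

/* Run the stages in order and return early at the requested exit point. */
static void parse_segment_sketch(segment_sketch_t *seg, quality_t quality)
{
    assert(seg != NULL);

    stub_parse_dc(seg);
    if (quality == QUALITY_DC_ONLY)
        return;

    stub_parse_early_ac(seg);
    if (quality == QUALITY_SKIP_LAST_AC)
        return;

    stub_parse_final_ac(seg);
}
```

A caller would pick the level once per session (say, QUALITY_DC_ONLY
while scrubbing in an editor, QUALITY_FULL for final output) and pass it
down with each segment. A Y_ONLY switch could be handled the same way.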

Erik Walthinsen wrote:
> 
> On Thu, 27 Apr 2000, James Bowman wrote:
> 
> > I took a look at module sizes and found that we're at about 16k of text:
> > we're blowing the code cache, and when code moves around (like when you
> > remove a module) different functions are caching against each other and
> > changing performance in surprising ways.
> Whee! ;-)
> 
> > The code should get smaller as we optimize it, though, so this effect
> > will go away.  We should be safe if the decode loop fits in 12k.
> Yeah, we can do that pretty easily.  Eventually I expect that a sufficient
> percentage of this will be written in ASM to keep it well below that.
> Then of course we have to worry about blowing the data cache.  That means
> all sorts of tricks, most of which aren't set up yet (such as using
> non-cacheable pages, which means a kernel hack).
>