From: Erik W. <om...@cs...> - 2000-04-28 07:26:28
On Thu, 27 Apr 2000, James Bowman wrote:
> I'm experimenting with a new bitstream implementation that avoids
> branches. Basically it avoids keeping any state about the bitstream
> around, avoids branches, and makes parse.o 2k smaller. It looks like
> this:
The problem seems to be that it does a lot more work overall. It looks
a lot like Michael Hipp's code from mpg123, which has been replaced with
more stateful bitstream code with much success. In fact, I just checked
and it's not even using MMX getbits yet, and it's still faster.
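For anyone who hasn't seen the stateful style, here is a rough sketch of the general idea: keep a 32-bit cache of unread bits plus a count, and top the cache up a byte at a time. This is not the mpg123 or libdv code. Every name here (sbits_t, sbits_fill, and so on) is made up for illustration, and reading past the end of the buffer simply yields zero bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stateful reader; names are illustrative only. */
typedef struct {
    const uint8_t *buf;   /* input bytes */
    size_t len;           /* number of bytes in buf */
    size_t pos;           /* index of the next byte to load */
    uint32_t cache;       /* unread bits, left-aligned (MSB first) */
    unsigned bits;        /* number of valid bits in cache */
} sbits_t;

static void sbits_init(sbits_t *bs, const uint8_t *buf, size_t len) {
    bs->buf = buf;
    bs->len = len;
    bs->pos = 0;
    bs->cache = 0;
    bs->bits = 0;
}

/* Load whole bytes until the cache holds at least 25 bits.
 * Past the end of the input, zero bytes are supplied. */
static void sbits_fill(sbits_t *bs) {
    while (bs->bits <= 24) {
        uint32_t byte = bs->pos < bs->len ? bs->buf[bs->pos] : 0;
        bs->pos++;
        bs->cache |= byte << (24 - bs->bits);
        bs->bits += 8;
    }
}

/* Peek at the next n bits (1..24) without consuming them. */
static uint32_t sbits_show(sbits_t *bs, unsigned n) {
    assert(n >= 1 && n <= 24);
    sbits_fill(bs);
    return bs->cache >> (32 - n);
}

/* Consume n bits (1..24). */
static void sbits_skip(sbits_t *bs, unsigned n) {
    assert(n >= 1 && n <= 24);
    sbits_fill(bs);
    bs->cache <<= n;
    bs->bits -= n;
}

/* Read and consume n bits (1..24). */
static uint32_t sbits_get(sbits_t *bs, unsigned n) {
    uint32_t v = sbits_show(bs, n);
    sbits_skip(bs, n);
    return v;
}
```

The refill loop is a branch, of course, but it only touches memory when the cache runs low. Whether that trade wins for any particular codec has to be measured.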
> static inline guint32 bitstream_show_16(bitstream_t * bs) {
> guint32 a, b, c;
> guint r = bs->offset & 7;
> guint8 *s = &bs->buf[bs->offset >> 3];
>
> return (((s[0] << 16) | (s[1] << 8) | s[2]) >> (8 - r)) & 0xffff;
> }
This is the killer, since shifting regular ia32 registers around is quite
expensive (on the order of 3 cycles per shift, IIRC).
> static inline guint32 bitstream_show(bitstream_t * bs, guint32 num_bits)
> {
> return bitstream_show_16(bs) >> (16 - num_bits);
> }
This is limited to getting 16 bits, which is OK for libdv but not for a
lot of other codecs. MPEG requires at least 24.
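To illustrate the width limit: one way to stretch the same stateless approach to 24 bits is to load four bytes instead of three. A minimal sketch follows, with plain C types instead of the glib ones. The name show_bits_24 is hypothetical, and the code assumes the buffer has at least three readable bytes past the last byte being addressed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stateless show of up to 24 bits at an arbitrary bit offset.
 * Assumes buf is padded so that four bytes starting at the addressed byte
 * can always be read. */
static inline uint32_t show_bits_24(const uint8_t *buf, size_t bit_offset,
                                    unsigned num_bits) {
    const uint8_t *s = &buf[bit_offset >> 3];  /* byte holding the first bit */
    unsigned r = bit_offset & 7;               /* bits to skip in that byte */
    uint32_t w = ((uint32_t)s[0] << 24) | ((uint32_t)s[1] << 16) |
                 ((uint32_t)s[2] << 8) | (uint32_t)s[3];

    assert(num_bits >= 1 && num_bits <= 24);
    /* Drop the r leading bits, then keep the top num_bits. */
    return (w << r) >> (32 - num_bits);
}
```

Since r is at most 7, r + num_bits never exceeds 31, so the wanted bits always fit in the 32-bit word. This still pays for two shifts and four byte loads on every call.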
> Here's the trouble: dv_parse_ac_coeffs calls dv_find_spilled_vlc which
> uses bitstream_unget to push bits back into the stream. Any suggestions
> on an alternative implementation which avoids doing this?
Dunno. Buck hasn't had a chance to explain his parsing code to me fully,
and he's out till Tuesday. Hopefully he can answer this question then
(or sooner, if he gets a hold of his mail).
This is one sub-project that needs tackling in the generic sense, though.
I personally have gathered and/or written some dozen bitstream/getbits
routines, all of which have advantages and disadvantages. Merging them
together into a single, releasable toolkit would have major advantages. I
would expect that mpg123 could gain another 20% by using a proper
implementation. mpeg2dec should gain 2-5%. The key is providing a
specializable header file for inlines, such that one might turn off
certain features like large shows (get rid of next_word) and ungets, in
order to simplify most of the routines. Of course, dealing with MMX and
other stuff without being too much of a pain to work with is critical.
I suppose I should put up some of our ideas on this.... Stay tuned.
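As a very rough sketch of what I mean by a specializable header: the client defines a couple of feature macros before including it, and the inlines shrink accordingly. None of this is existing code. The macro names (BITS_LARGE_SHOW, BITS_UNGET) and the bits_* functions are invented for the example, and a simple offset-based reader stands in for the real thing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature switches; a client would set these before including. */
#ifndef BITS_LARGE_SHOW
#define BITS_LARGE_SHOW 0   /* 0: shows of up to 16 bits, 1: up to 24 bits */
#endif
#ifndef BITS_UNGET
#define BITS_UNGET 0        /* 1: compile in bits_unget() */
#endif

#if BITS_LARGE_SHOW
#define BITS_MAX_SHOW 24
#else
#define BITS_MAX_SHOW 16
#endif

typedef struct {
    const uint8_t *buf;   /* input bytes, assumed padded for over-reads */
    size_t offset;        /* current position in bits */
} bits_t;

/* Peek at the next n bits (1..BITS_MAX_SHOW). */
static inline uint32_t bits_show(const bits_t *bs, unsigned n) {
    const uint8_t *s = &bs->buf[bs->offset >> 3];
    unsigned r = bs->offset & 7;
    uint32_t w = ((uint32_t)s[0] << 24) | ((uint32_t)s[1] << 16) |
                 ((uint32_t)s[2] << 8);
#if BITS_LARGE_SHOW
    w |= (uint32_t)s[3];  /* fourth byte is only needed for wide shows */
#endif
    assert(n >= 1 && n <= BITS_MAX_SHOW);
    return (w << r) >> (32 - n);
}

/* Consume n bits. */
static inline void bits_flush(bits_t *bs, unsigned n) {
    bs->offset += n;
}

#if BITS_UNGET
/* Push n bits back; only compiled in when the client asks for it. */
static inline void bits_unget(bits_t *bs, unsigned n) {
    assert(n <= bs->offset);
    bs->offset -= n;
}
#endif
```

With both switches left at 0, a codec like libdv would get the three-byte load and no unget code at all. An MPEG decoder would define BITS_LARGE_SHOW to 1 before the include.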
Erik Walthinsen <om...@cs...> - Staff Programmer @ OGI
Quasar project - http://www.cse.ogi.edu/DISC/projects/quasar/
Video4Linux Two drivers and stuff - http://www.cse.ogi.edu/~omega/v4l2/
__
/ \ SEUL: Simple End-User Linux - http://www.seul.org/
| | M E G A Helping Linux become THE choice
_\ /_ for the home or office user