Re: [Algorithms] sparse bitset compression
From: Robin G. <rob...@gm...> - 2014-01-21 08:08:16
Here's a modern run-length variant of Golomb-Rice encoding designed for compressing general data with a Generalized Gaussian distribution. It works best if you can show that, in most cases, there are only small differences between adjacent data (but handily, you get to define what "adjacent" means for your stream). Encode a single starting value at full bandwidth, then send a stream of differences:

https://research.microsoft.com/pubs/102069/malvar_dcc06.pdf

As a bonus, if you're encoding depth images, you can remap the values to leave zero as an out-of-band value:

http://www.charlesneedham.com/pubs/153971/depthcode-final.pdf

Yes, I know that wasn't the question you asked, but at least it's an algorithm. :-)

- Robin Green

On Fri, Jan 17, 2014 at 10:04 AM, Alex Walters <ajw...@gm...> wrote:
> It's on the fringe of being useful, but one thing you could look at to
> reduce your data size is Exponential-Golomb coding
> (http://en.wikipedia.org/wiki/Exponential-Golomb_coding); it's used in
> H.264 entropy encoding, among other things.
>
> Instead of writing a full n-bit value for every number you store, it
> produces a bit stream of codes, using far fewer bits for small numbers -
> take a look at the wiki page; it's pretty simple to see what's going on
> when you look at the example. It can reduce the amount of data you have
> to move, and from what I remember from when I worked with video, it
> produces a much more predictable (and compressible) stream of bits for
> the next compression scheme along (variable-length encoding, or
> arithmetic encoding in the case of H.264). I haven't looked into the
> details, but it could probably improve the compression of the stream you
> get through gzip.
>
> Regards,
>
> Alex