#9 BTree key compression

open
nobody
None
5
2005-08-22
2005-08-22
Kevin Day
No

Adds hooks for implementing compression of keys in
BTree, plus two implementations of compressors.

If input keys are:

in:

AAAAAA
AAAAAB
AAAAABC

DefaultCompressionProvider, performs no compression

out:

AAAAAA
AAAAAB
AAAAABC

LeadingValueCompressionProvider which provides leading
value compression:

out:

AAAAAA
{5}B
{6}C

The algorithm is developed to work against arbitrary
byte arrays (keys do NOT have to be strings - in fact,
this approach should work quite well for long
serialization, etc...).

Key serializers do need to generate "compatible" output
- meaning that the order of data in the byte array
needs to be conducive to leading key compression.

For example, if you are using strings as your keys, you
need to be sure that the serializer you use does not
place the length of the string at the beginning of the
byte array. Actually, this isn't strictly true - the
LeadingValueCompressionProvider can be configured to
ignore a fixed number of bytes when searching for
common leading byte values - so if you use a serializer
that *always* writes a 2 byte length value to the
beginning of the byte array, you can configure the
provider to ignore those 2 bytes.

- K

Discussion

  • Kevin Day
    Kevin Day
    2005-08-22

    Generalized key compression patch