Realtime big-data tool at the bit level, based on an immutable AVL forest that runs in memory or, in future versions, as a Merkle forest like a blockchain. The main object is a sparse bit string (Bits) that scales efficiently up to 2^63 bits and is normally compressed, because the forest shares duplicated substrings. Bits objects support reading a bit, byte, short, int, or long (Java primitives) at any bit index in the 64-bit range. Example: instead of building a class to hold a header and then data, represent all of that as Bits, subranges of them, and ints for the sizes of its parts. Since Bits is a Java interface, other kinds of compression can be added later. The main functions on Bits are substring, concat, number of 0 or 1 bits, and number of bits (size). All of those operations can be done millions of times per second regardless of size, because the AVL forest reuses existing branches recursively. There are two Java packages: a scalar package (originally for copy/pasting subranges of sounds) and a bit package. A sparse N-dimensional matrix is also included.
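The branch reuse described above can be illustrated with a minimal sketch. It assumes a plain immutable binary tree: `BitNode` and its methods are hypothetical names, not the project's actual Bits interface, and AVL rebalancing is left out to keep it short.

```java
/** Hypothetical sketch of an immutable bit-string tree; not the project's Bits API. */
final class BitNode {
    final long size;           // number of bits in this subtree
    final long ones;           // number of 1 bits in this subtree
    final boolean bit;         // leaf value, meaningful only when left == null
    final BitNode left, right; // both null for a leaf

    static final BitNode ZERO = new BitNode(false);
    static final BitNode ONE = new BitNode(true);

    private BitNode(boolean bit) {
        this.size = 1;
        this.ones = bit ? 1 : 0;
        this.bit = bit;
        this.left = null;
        this.right = null;
    }

    private BitNode(BitNode left, BitNode right) {
        this.size = left.size + right.size;
        this.ones = left.ones + right.ones;
        this.bit = false;
        this.left = left;
        this.right = right;
    }

    /** Allocates one node and shares both operands (no rebalancing in this sketch). */
    static BitNode concat(BitNode a, BitNode b) {
        return new BitNode(a, b);
    }

    /** Returns the bit at index i by walking one path from the root to a leaf. */
    boolean bitAt(long i) {
        if (i < 0 || i >= size) throw new IndexOutOfBoundsException("index " + i);
        BitNode n = this;
        while (n.left != null) {
            if (i < n.left.size) {
                n = n.left;
            } else {
                i -= n.left.size;
                n = n.right;
            }
        }
        return n.bit;
    }

    /** Returns the non-empty range [from, to), reusing whole subtrees where possible. */
    BitNode substring(long from, long to) {
        if (from < 0 || to > size || from >= to) {
            throw new IllegalArgumentException("bad range " + from + ".." + to);
        }
        if (from == 0 && to == size) return this; // whole subtree is shared as-is
        long mid = left.size;
        if (to <= mid) return left.substring(from, to);
        if (from >= mid) return right.substring(from - mid, to - mid);
        return concat(left.substring(from, mid), right.substring(0, to - mid));
    }
}
```

In this sketch, concatenating `BitNode.ZERO` with itself 40 times yields a string of 2^40 zero bits backed by only 41 node objects, and `size` and `ones` are read directly from the root. The depth stays logarithmic here only because the tree is built by doubling; a real implementation would rebalance on concat.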
Features
- AVL tree balancing keeps the forest from becoming deep and slow
- Bits substring, concat, and counting 1 bits in any subrange or combination cost only logarithmic time and memory (millions of operations per second on an average computer)
- Versioning on the N-dimensional matrix object (Multidim), since it's only a view of a Bits object. I've tested this on 10000 images from the MNIST OCR data.
- Scalar and Bit versions: it was originally scalar, for copy/pasting subranges of sound. The same operations work for bit strings.
- Can store sounds that are years long, since it's sparse. The same works for bit strings up to 2^63 bits.
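
The idea of a matrix that is only a view of a flat bit string can be sketched as follows. This is an illustration under stated assumptions, not the project's Multidim class: `MultidimView` is a hypothetical name, `java.util.BitSet` stands in for a Bits object (so indexes are limited to the int range here), and the row-major layout and most-significant-bit-first cell order are assumed.

```java
import java.util.BitSet;

/** Hypothetical N-dimensional view of fixed-width cells over a flat bit string. */
final class MultidimView {
    private final BitSet bits;  // stand-in for a Bits object
    private final int cellBits; // width of each cell in bits
    private final long[] dims;  // size of each dimension, outermost first

    MultidimView(BitSet bits, int cellBits, long... dims) {
        if (cellBits < 1 || cellBits > 64) throw new IllegalArgumentException("cellBits");
        this.bits = bits;
        this.cellBits = cellBits;
        this.dims = dims.clone();
    }

    /** Row-major bit offset of the cell at the given coordinates. */
    long bitOffset(long... index) {
        if (index.length != dims.length) {
            throw new IllegalArgumentException("wrong number of coordinates");
        }
        long cell = 0;
        for (int d = 0; d < dims.length; d++) {
            if (index[d] < 0 || index[d] >= dims[d]) {
                throw new IndexOutOfBoundsException("dimension " + d);
            }
            cell = cell * dims[d] + index[d];
        }
        return cell * cellBits;
    }

    /** Reads one cell as a bit pattern, most significant bit first (assumed order). */
    long get(long... index) {
        long off = bitOffset(index);
        long value = 0;
        for (int i = 0; i < cellBits; i++) {
            boolean b = bits.get(Math.toIntExact(off + i));
            value = (value << 1) | (b ? 1L : 0L);
        }
        return value;
    }
}
```

For a set of 10000 images of 28 by 28 pixels at 8 bits each, `new MultidimView(bits, 8, 10000, 28, 28)` would place image 1 at bit offset 784 * 8 = 6272. Because the view holds no data of its own, a new version of the matrix only needs a different underlying bit string.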