Re: [Algorithms] sparse bitset compression
From: Alex W. <ajw...@gm...> - 2014-01-17 18:04:54
It's on the fringe of being useful, but one thing you could look at to
reduce your data size is Exponential-Golomb coding
(http://en.wikipedia.org/wiki/Exponential-Golomb_coding); it's used in
H.264 entropy encoding, among other things. Instead of writing a full
n-bit value for every number you store, it produces a bit stream of
codes, using far fewer bits for small numbers - take a look at the wiki
page; it's pretty simple to see what's going on when you look at the
example (a minimal encoder sketch is appended after this message). It
can reduce the amount of data you have to move, and from what I
remember from when I worked with video, it produces a much more
predictable (and compressible) stream of bits for the next compression
scheme along (variable-length encoding, or arithmetic encoding in the
case of H.264) - I've not looked into the details, but it could
probably improve the compression of the stream you get through gzip.

Regards,
Alex

On Fri, Jan 17, 2014 at 10:14 AM, Richard Fabian <ra...@gm...> wrote:

>> I would question this assumption:
>>
>>> This is too much for sending over the network regularly as part of
>>> a save.
>
> Well, we've got our play-through logs showing how much data was being
> sent per day per user, and this part of the save was the bit being
> updated the most. Over a day of play it was adding up to about 5 MB,
> I think (can't remember precisely now), and this made it into the top
> three things we needed to compress better to reduce the client
> bandwidth cost (mobile carrier data limits and all that).
>
>> If I had to compress the data you talk about, I might want to look
>> into some implicit representation, like a quad tree with filled/not
>> nodes. Something like:
>> "Does the current sub-area have entities? If not, store 0, and
>> terminate. Else store 1. If the size of the sub-area is greater than
>> one, subdivide, and recurse for each sub-quadrant."
>> Depending on how clustered the entities are, this may compress well
>> or poorly (but with a max 7% fill rate, it ought to at least
>> compress somewhat.)
>
> I had thought about a quad tree, but I think I chose not to go that
> way because my previous experience with them was that they're not
> quite as efficient in practice as they should be. I can't remember
> why I think this, but I remember working through some code and
> thinking something along the lines of "Oh, yeah. Of course" - though
> that might have been because the data involved wasn't quite as
> clumpy.
>
>> Apply gzip on top of any encoding you come up with for possible
>> bonus gains.
>
> That's part of the network layer, so no point in doing it explicitly.
>
>> Sincerely,
>>
>> jw
>
> Thanks, I might try the quad tree idea again.
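Below is a minimal order-0 Exp-Golomb encoder in C++, sketching the
coding Alex describes. The BitWriter type, the MSB-first bit order, and
the gap-encoding helper are illustrative choices, not code from the
thread.

#include <cstdint>
#include <vector>

// Order-0 Exponential-Golomb: encode v >= 0 as (len-1) zero bits
// followed by the len-bit binary form of v+1, most significant bit
// first. v = 0 costs a single bit; small values stay short.
struct BitWriter {
    std::vector<uint8_t> bytes;
    int bitPos = 0; // bits already used in the last byte

    void putBit(int b) {
        if (bitPos == 0) bytes.push_back(0);
        if (b) bytes.back() |= uint8_t(0x80u >> bitPos);
        bitPos = (bitPos + 1) & 7;
    }

    void putExpGolomb(uint32_t v) {
        uint64_t x = uint64_t(v) + 1;
        int len = 0;
        for (uint64_t t = x; t != 0; t >>= 1) ++len; // bit length of x
        for (int i = 0; i < len - 1; ++i) putBit(0); // len-1 zero prefix
        for (int i = len - 1; i >= 0; --i)           // x itself, MSB first
            putBit(int((x >> i) & 1));
    }
};

// One plausible use for the sparse bitset: sort the occupied cell
// indices and encode the gaps between them, which stay small when the
// occupied cells cluster.
void encodeGaps(const std::vector<uint32_t>& sortedIndices, BitWriter& w) {
    uint32_t prev = 0;
    for (uint32_t idx : sortedIndices) {
        w.putExpGolomb(idx - prev);
        prev = idx;
    }
}

So v = 0 encodes as "1", v = 1 as "010", and v = 3 as "00100",
matching the worked example on the wiki page.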
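And a sketch of the implicit quadtree encoding jw describes in the
quoted text, reusing the BitWriter above. It assumes a square grid
whose side is a power of two, and occupied() is a hypothetical callback
reporting whether any entity lies inside the given square - both are
assumptions, not details from the thread.

#include <functional>

// Emit one bit per visited square: 0 = empty (stop), 1 = non-empty
// (subdivide unless the square is a single cell). Decoding replays the
// same recursion, reading a bit wherever this writes one.
using OccupancyFn = std::function<bool(int x, int y, int size)>;

void encodeQuad(int x, int y, int size, const OccupancyFn& occupied,
                BitWriter& out) {
    if (!occupied(x, y, size)) { out.putBit(0); return; }
    out.putBit(1);
    if (size == 1) return; // leaf cell, occupancy already coded
    int h = size / 2;      // recurse into the four quadrants
    encodeQuad(x,     y,     h, occupied, out);
    encodeQuad(x + h, y,     h, occupied, out);
    encodeQuad(x,     y + h, h, occupied, out);
    encodeQuad(x + h, y + h, h, occupied, out);
}

With a roughly 7% fill rate, most squares fail the occupancy test
early, so each empty region costs one bit; gzip on top then mops up the
remaining regularity, as jw suggests.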