Re: [Algorithms] sparse bitset compression
Brought to you by:
vexxed72
From: Richard F. <ra...@gm...> - 2014-01-18 00:10:39
|
hah, that was funny. I looked at the link, then realised that it was the same as what I'm doing except I pack with ones instead of zeros. 0 -> 0 100 -> 1 101 -> 2 11000 -> 3 etc. I'm pretty sure i read about this in a jpeg compression book more than 10 years ago. On 17 January 2014 18:04, Alex Walters <ajw...@gm...> wrote: > Its on the fringe of being useful, but one thing you could look at to > reduce your data size is Exponential Golomb coding ( > http://en.wikipedia.org/wiki/Exponential-Golomb_coding), its used in > H.264 entropy encoding among other things. > > Instead of writing a full n-bit values for every number you store, it > produces a bit stream of codes, using far less bits for small numbers - > take a look at the wiki page, its pretty simple to see whats going on when > you look at the example. It can reduce the amount of data you have to move, > and from what I remember from when I worked with video, it produces much > more predictable (and compressible) stream of bits for the next compression > scheme along (variable length encoding, or arithmetic encoding in the case > of H.264) - I've not looked into the details but could probably improve the > compression of the stream you get through gzip. > > Regards, > > Alex > > > On Fri, Jan 17, 2014 at 10:14 AM, Richard Fabian <ra...@gm...> wrote: > >> I would question this assumption: >>> >>> This is too much for sending over the network regularly as part of a >>>> save. >>> >>> >>> >> >> Well, we've got our play-through logs showing how much data was being >> sent per day per user, and this part of the save was the bit being updated >> the most. Over a day of play it was adding up to about 5mb I think (can't >> remember precisely now) and this made it into the top three we needed to >> compress better to reduce the client bandwidth cost (mobile carrier data >> limits and all that). >> >> >> >>> If I had to compress the data you talk about, I might want to look into >>> some implicit representation, like a quad tree with filled/not nodes. >>> Something like: >>> "Does the current sub-area have entities? If not, store 0, and >>> terminate. Else store 1. If the size of the sub-area is greater than one, >>> subdivide, and recurse for each sub-quadrant." >>> Depending on how clustered the entities are, this may compress well or >>> poorly (but with a max 7% fill rate, it ought to at least compress >>> somewhat.) >>> >> >> I had thought about a quad tree, but I think I chose to not go that way >> because my previous experience with them was that they're not quite as >> efficient in practice as the should be. I can't remember why I think this, >> but I remember working through some code and thinking something along the >> lines of "Oh, yeah. Of course" but that might have been because the data >> involved wasn't quite as clumpy. >> >> >>> Apply gzip on top of any encoding you come up with for possible bonus >>> gains. >>> >>> that's part of the network layer, so no point in doing it explicitly. >> >> >>> Sincerely, >>> >>> jw >>> >>> >>> Thanks, I might try the quad tree idea again. >> >> >> ------------------------------------------------------------------------------ >> CenturyLink Cloud: The Leader in Enterprise Cloud Services. >> Learn Why More Businesses Are Choosing CenturyLink Cloud For >> Critical Workloads, Development Environments & Everything In Between. >> Get a Quote or Start a Free Trial Today. >> >> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk >> _______________________________________________ >> GDAlgorithms-list mailing list >> GDA...@li... >> https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list >> Archives: >> http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithms-list >> > > > > ------------------------------------------------------------------------------ > CenturyLink Cloud: The Leader in Enterprise Cloud Services. > Learn Why More Businesses Are Choosing CenturyLink Cloud For > Critical Workloads, Development Environments & Everything In Between. > Get a Quote or Start a Free Trial Today. > > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk > _______________________________________________ > GDAlgorithms-list mailing list > GDA...@li... > https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list > Archives: > http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithms-list > -- fabs(); "The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd." - Bertrand Russell |