Thread: [Algorithms] sparse bitset compression
From: Richard F. <ra...@gm...> - 2014-01-14 12:38:22
The web is full of different solutions for compressing things, but some of you have probably already done this and found a compression scheme with a really good ratio that fits what I'm asking about.

I've got a load of entities (4-7k) that live on a 1024x1024 grid, and I need to persist them to a savegame which gets sent across the network to keep the save safe, so I need to compress it. I'm already compressing the entities in the world; ones that have been affected cost a few bytes, but for the ones that don't have any changes, all I'm left with is the coords. There are only a few that need the fatter compression, but at the numbers I'm looking at, those "unchanged entity" coords add up to a lot of data when stored as 20-bit structs. This is too much for sending over the network regularly as part of a save.

I've had some ideas on how to compress the data, but they might be crap. I don't know.

I can't easily regenerate the set, so I thought someone who wanted to compress consumables might know the name of a good solution for sparse bitset compression. The bits are clumped around different areas of the grid, so I feel that something that leverages spatial coherence might do well.

Any leads would be highly appreciated.

--
fabs();
"The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd." - Bertrand Russell
From: Oscar F. <os...@tr...> - 2014-01-14 13:11:13
Just a random thought, so apologies if I'm talking out of my arse, but... couldn't you just use something like the bit plane compression from Wavelet Difference Reduction (WDR)? A lot depends on your data, I appreciate, so this may not work out as any significant saving.

I wrote about the compression method on my blog a few years back:

http://trueharmoniccolours.co.uk/Blog/?p=55

Obviously that's in the context of a full wavelet image compressor, so you'll have to just grab the part I explain in the link above. Also, apologies for the code formatting on my blog... I'll fix it one day ;)

Oscar

On 14 January 2014 12:37, Richard Fabian <ra...@gm...> wrote:
> [quoted text trimmed]
From: Richard F. <ra...@gm...> - 2014-01-14 16:52:43
That sounds good. Something like the JPEG VLC for the actual distances between entries, and just storing deltas along a space-filling curve, should be good then.

On 14 January 2014 13:11, Oscar Forth <os...@tr...> wrote:
> [quoted text trimmed]
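A minimal sketch of this deltas-plus-VLC idea (with a LEB128-style varint standing in for the JPEG variable-length code, and plain linearised grid indices standing in for positions along a space-filling curve - both are stand-ins, not what JPEG actually uses):

```python
def varint_encode(n):
    """LEB128-style varint: 7 payload bits per byte, high bit = continue."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_positions(indices):
    """Sort linearised grid indices, then emit varint-coded gaps."""
    out = bytearray()
    prev = 0
    for i in sorted(indices):
        out += varint_encode(i - prev)
        prev = i
    return bytes(out)

def decode_positions(data):
    """Undo encode_positions: accumulate varint gaps back into indices."""
    indices, prev, n, shift = [], 0, 0, 0
    for byte in data:
        n |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            prev += n
            indices.append(prev)
            n, shift = 0, 0
    return indices
```

With 4-7k entities on a ~1M-cell grid the average gap is around 150, so most gaps should fit in one or two bytes - already well under the 20 bits per entity of the raw structs, before any entropy coding on top.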
From: Oscar F. <os...@tr...> - 2014-01-14 17:01:19
Pretty much. The space-filling curve might not be necessary, but you may be able to use it to keep smaller gaps between entries which are local to each other spatially.

On 14 January 2014 16:52, Richard Fabian <ra...@gm...> wrote:
> [quoted text trimmed]
From: Chris B. <cb...@gm...> - 2014-01-14 19:11:47
You might also look into Morton order encoding, as it is a simple and efficient locality-preserving way to do lossless n-dimensional tree structures - this is of course best for applications like map data. Your mileage may vary.

-Chris

ἀπὸ μηχανῆς ("from the machine") iPhone 5

> On Jan 14, 2014, at 9:01 AM, Oscar Forth <os...@tr...> wrote:
> [quoted text trimmed]
From: Richard F. <ra...@gm...> - 2014-01-14 19:41:11
I was going to plug in the Morton-coded version. Code from ryg's blog:

http://fgiesen.wordpress.com/2009/12/13/decoding-morton-codes/

On 14 January 2014 19:11, Chris Birke <cb...@gm...> wrote:
> [quoted text trimmed]

--
fabs();
"The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd." - Bertrand Russell
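For reference, the interleave itself is small enough to sketch. ryg's post gives the fast bit-twiddling versions; this is the naive loop form for 10-bit coordinates, which is all a 1024x1024 grid needs:

```python
def morton_encode(x, y):
    """Interleave the bits of two 10-bit coordinates into one Z-order index."""
    code = 0
    for i in range(10):
        code |= ((x >> i) & 1) << (2 * i)      # x bits land in even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits land in odd positions
    return code

def morton_decode(code):
    """De-interleave a 20-bit Z-order index back into (x, y)."""
    x = y = 0
    for i in range(10):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y
```

Sorting entities by this code before delta-encoding keeps spatially clumped entities near each other in the stream, so the gaps stay small.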
From: Jon W. <jw...@gm...> - 2014-01-15 21:21:28
I would question this assumption:

> This is too much for sending over the network regularly as part of a save.

How often do you send the savegame? 7k entities, times 20 bits, equals less than 18 kB of data. By comparison, the amount of cookie and image data involved in a single mouseover-and-click event in a web browser these days is probably about that same size. Are you saying that your save data must be more lightweight than a mouse click? Where does this limitation come from? Are you targeting a modem where a 5-second background upload is not acceptable?

If I had to compress the data you talk about, I might want to look into some implicit representation, like a quad tree with filled/not-filled nodes. Something like: "Does the current sub-area have entities? If not, store 0, and terminate. Else store 1. If the size of the sub-area is greater than one, subdivide, and recurse for each sub-quadrant." Depending on how clustered the entities are, this may compress well or poorly (but with a fill rate under 1% - 7k cells out of 1024x1024 - it ought to at least compress somewhat.)

Apply gzip on top of any encoding you come up with for possible bonus gains.

Sincerely,

Jon Watte

--
"I pledge allegiance to the flag of the United States of America, and to the republic for which it stands, one nation indivisible, with liberty and justice for all." ~ Adopted by U.S. Congress, June 22, 1942

On Tue, Jan 14, 2014 at 4:37 AM, Richard Fabian <ra...@gm...> wrote:
> [quoted text trimmed]
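This quad-tree bitstream can be sketched directly from the description above. A hedged toy version - a real encoder would pre-bucket the points per quadrant rather than rescanning the whole set at every node:

```python
def encode_quadtree(points, x, y, size, bits):
    """Emit 1 if this square contains any entity, else 0; while the square
    is bigger than one cell, subdivide and recurse into the four quadrants."""
    if not any(x <= px < x + size and y <= py < y + size for px, py in points):
        bits.append(0)
        return
    bits.append(1)
    if size > 1:
        half = size // 2
        for qx, qy in ((x, y), (x + half, y), (x, y + half), (x + half, y + half)):
            encode_quadtree(points, qx, qy, half, bits)

def decode_quadtree(bit_iter, x, y, size, out):
    """Mirror of the encoder: consume one bit per visited square."""
    if next(bit_iter) == 0:
        return
    if size == 1:
        out.add((x, y))
        return
    half = size // 2
    for qx, qy in ((x, y), (x + half, y), (x, y + half), (x + half, y + half)):
        decode_quadtree(bit_iter, qx, qy, half, out)
```

The appeal for clumped data is that every empty quadrant near the root kills a quarter of the grid with a single 0 bit, while the 1 bits concentrate down the few branches that actually contain entities.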
From: Richard F. <ra...@gm...> - 2014-01-17 10:15:16
> I would question this assumption:
>
> > This is too much for sending over the network regularly as part of a save.

Well, we've got our play-through logs showing how much data was being sent per day per user, and this part of the save was the bit being updated the most. Over a day of play it was adding up to about 5 MB I think (can't remember precisely now), and this made it into the top three things we needed to compress better to reduce the client bandwidth cost (mobile carrier data limits and all that).

> If I had to compress the data you talk about, I might want to look into
> some implicit representation, like a quad tree with filled/not-filled nodes.
> Something like: "Does the current sub-area have entities? If not, store 0,
> and terminate. Else store 1. If the size of the sub-area is greater than
> one, subdivide, and recurse for each sub-quadrant." Depending on how
> clustered the entities are, this may compress well or poorly.

I had thought about a quad tree, but I think I chose not to go that way because my previous experience with them was that they're not quite as efficient in practice as they should be. I can't remember why I think this, but I remember working through some code and thinking something along the lines of "Oh, yeah. Of course" - but that might have been because the data involved wasn't quite as clumpy.

> Apply gzip on top of any encoding you come up with for possible bonus gains.

That's part of the network layer, so no point in doing it explicitly.

Thanks, I might try the quad tree idea again.
From: Alex W. <ajw...@gm...> - 2014-01-17 18:04:54
|
It's on the fringe of being useful, but one thing you could look at to reduce your data size is Exponential Golomb coding ( http://en.wikipedia.org/wiki/Exponential-Golomb_coding); it's used in H.264 entropy encoding among other things.

Instead of writing full n-bit values for every number you store, it produces a bit stream of codes, using far fewer bits for small numbers - take a look at the wiki page, it's pretty simple to see what's going on when you look at the example. It can reduce the amount of data you have to move, and from what I remember from when I worked with video, it produces a much more predictable (and compressible) stream of bits for the next compression scheme along (variable length encoding, or arithmetic encoding in the case of H.264) - I've not looked into the details but it could probably improve the compression of the stream you get through gzip.

Regards,

Alex

On Fri, Jan 17, 2014 at 10:14 AM, Richard Fabian <ra...@gm...> wrote:
> Well, we've got our play-through logs showing how much data was being sent
> per day per user, and this part of the save was the bit being updated the
> most.

------------------------------------------------------------------------------
CenturyLink Cloud: The Leader in Enterprise Cloud Services.
Learn Why More Businesses Are Choosing CenturyLink Cloud For
Critical Workloads, Development Environments & Everything In Between.
Get a Quote or Start a Free Trial Today.
http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
_______________________________________________
GDAlgorithms-list mailing list
GDA...@li...
https://lists.sourceforge.net/lists/listinfo/gdalgorithms-list
Archives:
http://sourceforge.net/mailarchive/forum.php?forum_name=gdalgorithms-list
|
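The order-0 Exp-Golomb construction from the Wikipedia page Alex links is short enough to sketch directly. A rough illustration only, not production code; it works on bit strings for clarity rather than a packed byte stream:

```python
def expgolomb_encode(n):
    """Order-0 Exp-Golomb: binary of n+1, prefixed by (length - 1) zeros.
    0 -> '1', 1 -> '010', 2 -> '011', 3 -> '00100', ..."""
    b = bin(n + 1)[2:]
    return '0' * (len(b) - 1) + b

def expgolomb_decode(stream):
    """Decode a concatenation of order-0 Exp-Golomb codewords."""
    values, i = [], 0
    while i < len(stream):
        zeros = 0
        while stream[i] == '0':      # count the zero prefix
            zeros += 1
            i += 1
        # the prefix length tells us how many bits the value occupies
        values.append(int(stream[i:i + zeros + 1], 2) - 1)
        i += zeros + 1
    return values
```

The codewords are self-delimiting, so a whole list of small numbers can be concatenated into one stream with no separators.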
From: Samuel M. <sam...@go...> - 2014-01-17 19:11:46
|
Another idea that might be useful is to think of your positions dataset as a black-and-white 1024*1024 bitmap (this gets harder if two entities are allowed to occupy the same position) and compress it with some sort of run-length encoding, i.e. you go through each pixel line from left to right and, instead of saving 0s (for empty space) and 1s (for occupied pixels), you save how many 0s there are until the next 1.

On 1/17/14, Alex Walters <ajw...@gm...> wrote:
> Its on the fringe of being useful, but one thing you could look at to
> reduce your data size is Exponential Golomb coding (
> http://en.wikipedia.org/wiki/Exponential-Golomb_coding), its used in H.264
> entropy encoding among other things.
|
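Samuel's run-length idea turns the bitmap into a list of zero-run lengths, which is exactly the kind of small-number stream that feeds well into a variable-length code like the one Alex mentions. A rough sketch, assuming the 1024x1024 grid from the original post and at most one entity per cell:

```python
def cells_to_gaps(cells, width=1024):
    """Scan the grid row by row; record how many empty cells precede
    each occupied one."""
    occupied = sorted(y * width + x for x, y in cells)  # flatten to scan order
    gaps, prev = [], -1
    for idx in occupied:
        gaps.append(idx - prev - 1)  # zeros since the previous set bit
        prev = idx
    return gaps

def gaps_to_cells(gaps, width=1024):
    """Inverse: walk the gap list back to (x, y) coordinates."""
    cells, idx = [], -1
    for g in gaps:
        idx += g + 1
        cells.append((idx % width, idx // width))
    return cells
```

With clumped entities most gaps are tiny, and a handful are huge, which is the skewed distribution that Exp-Golomb or Golomb-Rice codes are built for.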
From: Richard F. <ra...@gm...> - 2014-01-18 00:10:39
|
hah, that was funny. I looked at the link, then realised that it was the same as what I'm doing except I pack with ones instead of zeros.

0 -> 0
100 -> 1
101 -> 2
11000 -> 3
etc.

I'm pretty sure I read about this in a jpeg compression book more than 10 years ago.

On 17 January 2014 18:04, Alex Walters <ajw...@gm...> wrote:
> Its on the fringe of being useful, but one thing you could look at to
> reduce your data size is Exponential Golomb coding (
> http://en.wikipedia.org/wiki/Exponential-Golomb_coding), its used in
> H.264 entropy encoding among other things.

--
fabs();
"The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd." - Bertrand Russell
|
From: Robin G. <rob...@gm...> - 2014-01-21 08:08:16
|
Here's a modern Run-Length version of Golomb-Rice encoding designed for compressing general data with a Generalized Gaussian distribution. It works best if you can show there are, in most cases, small differences between adjacent data (but handily you get to define what "adjacent" means for your stream). Encode a single starting value at full bandwidth, then send a stream of differences:

https://research.microsoft.com/pubs/102069/malvar_dcc06.pdf

As a bonus, if you're encoding depth images, you can remap the values to leave zero as an out-of-band value:

http://www.charlesneedham.com/pubs/153971/depthcode-final.pdf

Yes, I know that wasn't the question you asked but at least it's an algorithm. :-)

- Robin Green

On Fri, Jan 17, 2014 at 10:04 AM, Alex Walters <ajw...@gm...> wrote:
> Its on the fringe of being useful, but one thing you could look at to
> reduce your data size is Exponential Golomb coding (
> http://en.wikipedia.org/wiki/Exponential-Golomb_coding), its used in
> H.264 entropy encoding among other things.
|
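The delta-then-Golomb-Rice idea can be sketched like this. A toy version only: the paper's RLGR coder adapts its Rice parameter and run-length mode on the fly, which this fixed-k sketch does not, and the 20-bit starting width is just taken from the coord structs in the original post.

```python
def zigzag(d):
    """Interleave signed deltas into unsigned: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * d if d >= 0 else -2 * d - 1

def rice_encode(n, k):
    """Golomb-Rice: unary quotient, '0' terminator, then k-bit remainder."""
    q, r = n >> k, n & ((1 << k) - 1)
    return '1' * q + '0' + (format(r, '0%db' % k) if k else '')

def delta_rice_encode(values, k=2, width=20):
    """First value at full width, then Rice-coded zigzagged differences."""
    out = format(values[0], '0%db' % width)
    for prev, cur in zip(values, values[1:]):
        out += rice_encode(zigzag(cur - prev), k)
    return out
```

Small differences cost only k+1 bits each, so the scheme wins whenever adjacent values sit close together, and degrades gracefully (via the unary quotient) when they don't.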
From: Jon W. <jw...@gm...> - 2014-01-31 21:32:10
|
> Encode a single starting value at full bandwidth then send a stream of
> differences:

Which is equivalent to a wavelet "lift" transform :-)

Back to the original question: If you have a previous save game, is there lots of similarity in the next save? If so, can you save a delta instead? If not, what solution did you end up choosing?

Sincerely,

Jon Watte

--
"I find that the harder I work, the more luck I seem to have." -- Thomas Jefferson

On Tue, Jan 21, 2014 at 12:08 AM, Robin Green <rob...@gm...> wrote:
> Here's a modern Run Length version of Golomb-Rice encoding designed for
> compressing general data with a Generalized Gaussian distribution.
|
From: Richard F. <ra...@gm...> - 2014-01-31 23:28:03
|
there was a lot of similarity, but with no processing at the destination, we decided that the Golomb-Rice algorithm was good enough. Dropped our data by 70%. That was enough of a saving for this one area.

On 31 January 2014 21:32, Jon Watte <jw...@gm...> wrote:
> Back to the original question: If you have a previous save game, is there
> lots of similarity in the next save? If so, can you save a delta instead?
> If not, what solution did you end up choosing?

--
fabs();
"The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd." - Bertrand Russell
|
From: Colt M. <mai...@gm...> - 2014-03-08 15:13:23
|
Why not just use industry-standard Arithmetic Compression?

If you calculate the estimated entropy for that data set (http://planetcalc.com/2476/) you end up with about H=0.06 for 7k set bits in a 1024*1024 stream, and H=0.01 for 1k set bits. That entropy (the minimum bits per symbol) should yield about 70 kb and 10 kb after compression, respectively. Or, with AC and those sparse values, you can get around 94% compression and 99% compression (if my early morning math is right).

AC is a known algorithm, with easy examples to find on the web. And as far as serialization is concerned, it's pretty straightforward to just encode your bits, then decode; no extra special data structures needed.

~Main

On Fri, Jan 31, 2014 at 3:27 PM, Richard Fabian <ra...@gm...> wrote:
> there was a lot of similarity, but with no processng at the destination,
> we decided that the Golomb-rice algorithm was good enough. Dropped our data
> by 70%. That was enough of a saving for this one area.

--
==
Colt "MainRoach" McAnlis
Graphics Programmer
|
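Colt's figures are easy to sanity-check: the entropy of the bitmap viewed as a biased bit stream gives the lower bound on compressed size, whatever coder is used. A quick check (sizes in kilobits, since the raw stream is one bit per grid cell):

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits per symbol, of a biased bit with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

grid = 1024 * 1024
for set_bits in (7000, 1000):
    h = binary_entropy(set_bits / grid)
    print('%d set bits: H = %.3f bits/cell, lower bound ~%.0f kbit'
          % (set_bits, h, grid * h / 1000))
```

This reproduces the H=0.06 and H=0.01 figures above; no coder on a memoryless model can beat that bound, though exploiting the spatial clumping (which the entropy of independent bits ignores) can.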
From: Richard F. <ra...@gm...> - 2014-03-09 16:40:57
|
AC (and now Finite State Entropy, https://github.com/Cyan4973/FiniteStateEntropy) are really good, but carry the weight of spending more time than we had to do the task. Golomb coding the distance between high bits was a great, simple-to-implement solution for us.

Another issue was the compression time. It needed to be quick, and AC is not known to be fast. In future, I hope to use FSE for almost all my lossless coders.

On 8 March 2014 15:13, Colt McAnlis <mai...@gm...> wrote:
> Why not just use industry standard Arithmetic Compression?
>
> If you calculate the estimated entropy for that data set (
> http://planetcalc.com/2476/) you end up with about H=0.06 for 7k set bits
> in a 1024*1024 stream, and H=0.01 for 1k set bits.

--
fabs();
"The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd." - Bertrand Russell
|