I would question this assumption:

This is too much for sending over the network regularly as part of a save.

How often do you send the savegame? 7k entities, times 20 bits, equals less than 18 kB of data.
By comparison, the amount of cookie and image data involved in a single mouseover-and-click event in a web browser these days is probably about that same size.
Are you saying that your save data must be more lightweight than a mouse click? Where does this limitation come from? Are you targeting a modem where a 5 second background upload is not acceptable?
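
To make the numbers above concrete, here is a small back-of-the-envelope sketch. The entity count and the 20 bits per entity come from the original question; the 28.8 kbit/s link speed is purely an assumption chosen to represent a slow modem, and the function names are made up for illustration.

```python
def raw_size_bytes(entity_count, bits_per_entity):
    """Size of the unchanged-entity coordinates when stored as fixed-width structs."""
    return entity_count * bits_per_entity / 8

def upload_seconds(size_bytes, link_bits_per_second):
    """Idealized transfer time, ignoring protocol overhead and latency."""
    return size_bytes * 8 / link_bits_per_second

# 7k entities at 20 bits each (figures from the question).
size = raw_size_bytes(7000, 20)

# Assumed link speed: 28.8 kbit/s, a hypothetical slow modem.
seconds = upload_seconds(size, 28800)

print(f"{size:.0f} bytes, about {seconds:.1f} s on the assumed link")
```

On a typical broadband connection the same payload takes a small fraction of a second, which is why the stated limit seems worth questioning.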

If I had to compress the data you describe, I would look into an implicit representation, such as a quadtree whose nodes are marked either filled or empty. Something like:
"Does the current sub-area contain any entities? If not, store 0 and stop. Otherwise store 1. If the size of the sub-area is greater than one cell, subdivide it and recurse into each sub-quadrant."
Depending on how clustered the entities are, this may compress well or poorly (but with a fill rate of at most about 0.7%, i.e. 7k entities on 1024x1024 cells, it ought to at least compress somewhat.)
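
The quadtree rule quoted above can be sketched roughly as follows. This is only an illustration, not a tuned implementation: it assumes at most one entity per cell (a plain bitset), a square grid whose side is a power of two, and it keeps the output as a Python list of 0/1 values rather than a packed bitstream. The names `encode` and `decode` are made up for this example.

```python
def encode(points, x0, y0, size, bits):
    """Append occupancy bits for the square [x0, x0+size) x [y0, y0+size)."""
    if not points:
        bits.append(0)          # empty sub-area: store 0 and stop
        return
    bits.append(1)              # sub-area contains at least one entity
    if size == 1:
        return                  # single cell, nothing left to subdivide
    half = size // 2
    for qy in (y0, y0 + half):
        for qx in (x0, x0 + half):
            sub = [(x, y) for (x, y) in points
                   if qx <= x < qx + half and qy <= y < qy + half]
            encode(sub, qx, qy, half, bits)

def decode(bit_iter, x0, y0, size, out):
    """Rebuild the occupied cells by visiting quadrants in the encoder's order."""
    if next(bit_iter) == 0:
        return
    if size == 1:
        out.append((x0, y0))
        return
    half = size // 2
    for qy in (y0, y0 + half):
        for qx in (x0, x0 + half):
            decode(bit_iter, qx, qy, half, out)

# Tiny example on a 4x4 grid; the real grid would use size=1024.
entities = [(0, 0), (3, 3)]
bits = []
encode(entities, 0, 0, 4, bits)

restored = []
decode(iter(bits), 0, 0, 4, restored)
```

For the real data you would call `encode(entities, 0, 0, 1024, bits)` and compare `len(bits)` against the 20 bits per entity of the raw format to see whether the clustering pays off.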

Apply gzip on top of any encoding you come up with for possible bonus gains.
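
As a rough sketch of that last step, the snippet below packs a list of 0/1 values into bytes and runs it through Python's `zlib` module, which uses the same DEFLATE algorithm as gzip with a different wrapper. The helper names are hypothetical, and the bit count is assumed to be stored separately (for example in a save header), since the padding bits in the last byte are otherwise indistinguishable from data.

```python
import zlib

def pack_bits(bits):
    """Pack a sequence of 0/1 ints into bytes, most significant bit first."""
    out = bytearray((len(bits) + 7) // 8)   # zero-padded to a whole byte
    for i, b in enumerate(bits):
        if b:
            out[i >> 3] |= 0x80 >> (i & 7)
    return bytes(out)

def compress_bits(bits):
    """DEFLATE the packed bits at the highest compression level."""
    return zlib.compress(pack_bits(bits), 9)

def decompress_bits(blob, nbits):
    """Inverse of compress_bits; nbits must be known from elsewhere."""
    raw = zlib.decompress(blob)
    return [(raw[i >> 3] >> (7 - (i & 7))) & 1 for i in range(nbits)]

sample = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
blob = compress_bits(sample)
roundtrip = decompress_bits(blob, len(sample))
```

For inputs this small the zlib header makes the result larger than the input, so it is worth measuring on real save data before committing to the extra pass.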




On Tue, Jan 14, 2014 at 4:37 AM, Richard Fabian <raspo1@gmail.com> wrote:
The web is full of different solutions for compressing things, but some of you have probably already done this and found a really good ratio compression scheme that fits what I'm asking about.

I've got a load of entities (4-7k) that live on a grid (1024x1024), and I need to persist them to a savegame which gets sent across the network to keep the save safe. I need to compress it. I'm compressing the entities in the world already: ones that have been affected cost a few bytes, but for the ones that don't have any changes, all I'm left with is the coords. There are only a few that need the fatter compression, but at the numbers I'm looking at, those "unchanged entity" coords add up to a lot of data when storing as 20 bit structs. This is too much for sending over the network regularly as part of a save.

I've had some ideas on how to compress the data, but they might be crap. I don't know.

I can't easily regenerate the set so thought someone who wanted to compress consumables might know the name of a good solution for sparse bitset compression. The bits are clumped around different areas of the grid, so I feel that something that leverages spatial coherence might do well.

Any leads would be highly appreciated.

