I would question this assumption:

This is too much for sending over the network regularly as part of a save.

Well, we had play-through logs showing how much data was being sent per day per user, and this part of the save was the bit being updated the most. Over a day of play it was adding up to about 5 MB I think (can't remember precisely now), and that put it in the top three things we needed to compress better to reduce the client bandwidth cost (mobile carrier data limits and all that).

If I had to compress the data you talk about, I might want to look into some implicit representation, like a quad tree with filled/not nodes. Something like:
"Does the current sub-area have entities? If not, store 0, and terminate. Else store 1. If the size of the sub-area is greater than one, subdivide, and recurse for each sub-quadrant."
Depending on how clustered the entities are, this may compress well or poorly (but with a max 7% fill rate, it ought to at least compress somewhat.)
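To make the idea concrete, here's a rough sketch of that encoding in Python (names and bit-string representation are my own choices, not anything from an actual implementation): one bit per visited sub-area, 0 for empty, 1 for occupied, recursing until single cells.

```python
def encode(cells, x, y, size):
    """Bit string for the square sub-area at (x, y) with side `size`.
    `cells` is a set of (x, y) tuples of occupied grid cells;
    `size` is assumed to be a power of two."""
    occupied = any(
        x <= cx < x + size and y <= cy < y + size for (cx, cy) in cells
    )
    if not occupied:
        return "0"          # empty sub-area: store 0 and terminate
    if size == 1:
        return "1"          # single occupied cell: store 1, nothing to subdivide
    half = size // 2
    # occupied and larger than one cell: store 1, then recurse per quadrant
    return "1" + "".join(
        encode(cells, x + dx, y + dy, half)
        for dx in (0, half)
        for dy in (0, half)
    )

def decode(bits, x, y, size, pos=0, out=None):
    """Inverse of encode: returns (next bit position, set of occupied cells)."""
    if out is None:
        out = set()
    if bits[pos] == "0":
        return pos + 1, out
    if size == 1:
        out.add((x, y))
        return pos + 1, out
    pos += 1
    half = size // 2
    for dx in (0, half):
        for dy in (0, half):
            pos, out = decode(bits, x + dx, y + dy, half, pos, out)
    return pos, out
```

For a sparse, clumpy grid the output can come in well under one bit per cell, since whole empty quadrants collapse to a single 0; the worst case (entities spread evenly) costs a bit more than a raw bitmap.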

I had thought about a quad tree, but I think I chose not to go that way because my previous experience with them was that they're not quite as efficient in practice as they should be. I can't remember exactly why I think this, but I remember working through some code and thinking something along the lines of "Oh, yeah. Of course" — though that might have been because the data involved wasn't quite as clumpy.
Apply gzip on top of any encoding you come up with for possible bonus gains.

That's handled by the network layer already, so there's no point in doing it explicitly.


Thanks, I might try the quad tree idea again.