From: Alan S. <aj...@fr...> - 2008-01-26 20:41:34
> Is it possible to serialize to disk the memory data (binary) used by a
> judy array? So basically, let's say you create your array then you
> save it on the disk in binary form, then later you can restore it and
> have it back without repeating the previous operations.
>
> Could you please provide a sample of how one can accomplish this?

Unfortunately there's no API for this, unless someone has created one
since 2002 when libJudy was opensourced and I don't know about it. We
did talk about it a great deal before 2002, but it's another possible
feature that fell by the wayside when the project was canceled. We
called this, "persistent Judy arrays." We understand the desire for
them.

The closest we got was to creating some batch insertion functions that,
if I recall right, are undocumented but in the source code, and, "not
known not to work." You might look for those. Using them would still
require a first/next loop to write out all array values in some form
(ASCII or binary) that is later read back for re-insertion (batch or
not).

--

Note that the hard part about saving any binary data structure out of
memory to disk is what's commonly referred to as "pointer fixups". You
can't ensure that the data blocks constituting your "database" will come
back to the same memory addresses. Therefore any node pointing to any
other node is volatile. You'd need a way to keep track of them all
("meta-data") and fix them up upon rereading.

The "meta-data" can be null on disk and created on the fly while
reading, if you have a way to scan the saved data to unambiguously
locate pointers/addresses in order to "fix them up." (You might even
use a temporary JudyL array to map old to new addresses. :-)

Also, applications that chunk their databases into large, self-managed
blocks can make pointer fixups (of their own, application-specific data)
simpler/faster.
But late in JudyIV development we gave up on having our own
"small block memory manager" because straight malloc()/free() calls were
just as good, if not better, given a decent malloc() library.

Also note that by their nature, Judy arrays have many relatively small
nodes, meaning many pointers, although they are often held in a very
compressed form, not just simple addresses.

So you can see, it might almost not pay to try to save and restore the
binary data anyway. The pointer fixup overhead time might swamp any
savings over simple traverse/write/read/insert.

I'm sure Doug will follow up with more/different perspective.

Alan Silverstein
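
To make the "pointer fixups" idea above concrete, here is a minimal sketch in plain C. It does not touch libJudy internals at all: a simple linked list stands in for any pointer-linked structure, and the names Node, Record, save_list and load_list are hypothetical, made up for this illustration. Each node is written with its old address, and on reading an old-to-new address table is built and used to repair the links. The sketch assumes the file is read back on the same kind of machine (same word size and byte order), and it uses a linear search where a real implementation would want a faster map, such as the temporary JudyL array suggested above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical in-memory node; stands in for any pointer-linked structure. */
typedef struct Node {
    long value;
    struct Node *next;
} Node;

/* On-disk record: raw addresses as they were at save time (the "meta-data"). */
typedef struct {
    uintptr_t old_addr;  /* address the node had when it was saved */
    uintptr_t old_next;  /* address its next pointer held, 0 for NULL */
    long value;
} Record;

/* Write every node, head first. Returns 0 on success, -1 on I/O error. */
int save_list(const Node *head, FILE *fp)
{
    for (const Node *n = head; n != NULL; n = n->next) {
        Record r = { (uintptr_t)n, (uintptr_t)n->next, n->value };
        if (fwrite(&r, sizeof r, 1, fp) != 1)
            return -1;
    }
    return 0;
}

/* Map an old address to the newly allocated node (linear search for brevity). */
static Node *lookup(const Record *recs, Node **fresh, size_t count, uintptr_t old)
{
    for (size_t i = 0; i < count; i++)
        if (recs[i].old_addr == old)
            return fresh[i];
    return NULL;
}

/* Read all records, allocate new nodes, then fix up the next pointers. */
Node *load_list(FILE *fp)
{
    Record *recs = NULL;
    Node **fresh = NULL;
    size_t count = 0, cap = 0;
    Record r;

    /* Pass 1: read records and allocate a new node for each one. */
    while (fread(&r, sizeof r, 1, fp) == 1) {
        if (count == cap) {
            cap = cap ? cap * 2 : 16;
            recs = realloc(recs, cap * sizeof *recs);
            fresh = realloc(fresh, cap * sizeof *fresh);
            if (recs == NULL || fresh == NULL)
                abort();
        }
        recs[count] = r;
        fresh[count] = malloc(sizeof(Node));
        if (fresh[count] == NULL)
            abort();
        fresh[count]->value = r.value;
        fresh[count]->next = NULL;
        count++;
    }

    /* Pass 2: the fixup. Translate each saved address to a new address. */
    for (size_t i = 0; i < count; i++) {
        if (recs[i].old_next != 0) {
            fresh[i]->next = lookup(recs, fresh, count, recs[i].old_next);
            assert(fresh[i]->next != NULL);  /* saved pointer must resolve */
        }
    }

    Node *head = count ? fresh[0] : NULL;  /* head was written first */
    free(recs);
    free(fresh);
    return head;
}
```

A caller would build a list, pass its head and an open FILE * to save_list(), and later call load_list() on the rewound or reopened file. Even in this toy case the second pass does one lookup per pointer, which is the overhead that might swamp any savings over a simple traverse/write/read/insert.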