From: Donal K. F. <don...@ma...> - 2009-02-23 10:41:34
Alexandre Ferrieux wrote:
> On the margin of discussions about the init overhead of database
> engines, I am wondering whether the following could make sense:
>
> - allow for the hashtable behind a dict to be saved (serialized) in
>   its internal form (not the strep)

Well, the natural iteration order is the one that is in the string rep.
The parts of the dict that are not saved all don't matter that much
(e.g. pointers to preceding and following entries or the computed count
of the number of entries) because they're implied by the string rep.

> - allow for the resulting serialized form to be directly mmap()ped
>   and reused in a (readonly) dict

Dicts are poorly suited to mmap()ing, because they're chock full of real
pointers (as opposed to offsets within an array). Plus a dict isn't much
use without the Tcl_Objs to go into it.

> I know the overall process is no rocket science, it is just yet
> another hash-based micro-database engine, but I'd like your opinion on
> how it could blend with Tcl.

With great difficulty. Dicts are explicitly and utterly about being
values, and Tcl's values don't have their own identities; one copy of a
value is very much like another.

> But the main idea is to provide something that could instantly read
> and use a very large dict, with absolutely zero I/O overhead if the
> pages are still in RAM after recent use.
> (I'm doing this already outside the context of Tcl, and it really
> rocks: fork/exec/lookup under a jiffy)

I think you'd do better with reworking the guts of array so that you can
plug in there. More generally, it's something I've wanted to look into
myself for a while, with an aim to allowing the setup of different types
of array (e.g. an integer-indexed array of variables, or even an array
of doubles, so putting BLT vectors on a more formal basis) and the fact
that arrays are complex named entities means that there's a sensible
place to hang the extra bits of state off.

But it has been on my "to do" list for a while now, so don't feel that
you're intruding. If you want to have a go and report back what you find
out, be my guest. But in general, I think attaching to arrays has much
better "Architectural Smell" (by analogy with "Code Smell" :-)) than
attaching to dicts.

In short, your current plan is a bit wonky but the larger picture scheme
should be quite practical.

Donal.
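
To illustrate the pointers-versus-offsets point in the reply above, here is a minimal sketch of a hash table whose slots refer to keys and values by byte offset within one flat buffer, so the same bytes can be written to a file and looked up straight from an mmap()ed view with no load step. It is written in Python rather than Tcl or C, and it is not how Tcl's dict or array is implemented: the file layout, the CRC32 hash, the linear probing, and the names `build_table`, `lookup`, and `demo` are all assumptions made for this example.

```python
import mmap
import os
import struct
import tempfile
import zlib

# Hypothetical on-disk layout (an assumption for this sketch):
#   header: number of slots
#   slots:  key offset, key length, value offset, value length
#   data:   concatenated key and value bytes
HEADER = struct.Struct("<I")
SLOT = struct.Struct("<IIII")


def build_table(items):
    """Serialize a {bytes: bytes} mapping into one flat, pointer-free buffer."""
    nslots = max(8, 2 * len(items))          # keep the load factor at or below 0.5
    slots = [(0, 0, 0, 0)] * nslots          # key offset 0 marks an empty slot
    data = bytearray()
    base = HEADER.size + nslots * SLOT.size  # file offset where the data area starts
    for key, value in items.items():
        key_off = base + len(data)
        data += key
        val_off = base + len(data)
        data += value
        i = zlib.crc32(key) % nslots
        while slots[i][0] != 0:              # linear probing on collision
            i = (i + 1) % nslots
        slots[i] = (key_off, len(key), val_off, len(value))
    out = bytearray(HEADER.pack(nslots))
    for slot in slots:
        out += SLOT.pack(*slot)
    return bytes(out + data)


def lookup(buf, key):
    """Look up key in a buffer (bytes or mmap) produced by build_table."""
    (nslots,) = HEADER.unpack_from(buf, 0)
    i = zlib.crc32(key) % nslots
    for _ in range(nslots):
        key_off, key_len, val_off, val_len = SLOT.unpack_from(
            buf, HEADER.size + i * SLOT.size)
        if key_off == 0:
            return None                      # reached an empty slot: key is absent
        if buf[key_off:key_off + key_len] == key:
            return bytes(buf[val_off:val_off + val_len])
        i = (i + 1) % nslots
    return None


def demo():
    """Write a small table to a temporary file, map it read-only, and query it."""
    blob = build_table({b"alpha": b"1", b"beta": b"22", b"gamma": b"333"})
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(blob)
        with open(path, "rb") as f, \
                mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return lookup(m, b"beta"), lookup(m, b"missing")
    finally:
        os.remove(path)
```

Because every reference is an offset from the start of the buffer, the table is valid wherever the file happens to be mapped. A structure built from real pointers would need every pointer fixed up after mapping. The sketch also returns plain byte strings, which sidesteps the other objection raised above: a Tcl dict needs Tcl_Objs for its contents, and those would still have to be created on lookup.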