From: Brian H. <bri...@ql...> - 2003-04-25 19:20:35
On Fri, 25 Apr 2003, John Max Skaller wrote:

> By best you mean 'fastest'.

Fastest in the most common cases, to be precise. With "reasonable" (for
suitably loose definitions of reasonable) space wastage. With doubling you
never waste more than about 75% of the space, and rarely waste 50%. This is
actually not bad compared to other data structures, if you consider the
wasted space "infrastructure overhead". Consider a binary tree where the
data is stored only at the leaves. You will have about as many branch nodes
as you have leaf nodes, so suddenly half the memory is used in branch
nodes, holding references to other nodes rather than references to data:
"wasted" space.

Increasing the multiple decreases the number of copies you need to make, at
the cost of greatly increasing the average "wasted" space. Plus, powers of
two are generally easier to allocate than powers of anything else.

> > Hmm. Could be done with a reference. But this is a very non-ML sort of
> > concept.
>
> I agree: loss of reentrancy .. BUT provided the store is atomic
> it only affects performance not semantics if the variable is
> changed at random even by multiple threads. Setting a strategy
> for a whole program depending on what phase of execution it is up
> to is fairly useful.

I was just thinking not-functional. Not in that it wouldn't work, but in
that it's not how most functional programming flows, if that makes sense.

> --------------------------------------------------------
> one thing I found: I needed 3 arguments not 2:
>
> current slots, current used slots, required size

The inputs xarray has to give the resizer algorithm are:

- the current number of slots the array has
- the current number of slots the array needs

Note that if I'm increasing or decreasing the array size, I adjust the
number of slots needed before passing it into the resizer. What the resizer
returns is the number of slots the array should have.
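As a rough sketch of the interface just described, a doubling resizer might
look like the following. The names (`exponential_resizer`, `currsize`,
`length`) are taken from or guessed at from this thread; this is my
reconstruction, not the actual xarray code.

```ocaml
(* Sketch of the two-argument resizer described above.
   currsize = slots currently allocated, length = slots now needed.
   Returns the number of slots the array should have.
   Hypothetical reconstruction, not the real xarray implementation. *)
let exponential_resizer ~currsize ~length =
  if length <= currsize then currsize    (* already fits: no reallocation *)
  else begin
    (* keep doubling until the needed slots fit *)
    let n = ref (max currsize 1) in
    while !n < length do n := !n * 2 done;
    !n
  end
```

The caller then compares the result against the current size: if they
differ, it allocates a new array of that size and copies the data over.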
Once the resizer calculates what size the array should be, the xarray code
actually deals with allocating a new array and copying the data over (if
needed), etc. Maybe these parameters should have better names.

> Conceptually, the reason is: the ideal size
> may be influenced by the allocated size for example:

Always. You return the current size if you don't want to reallocate the
array.

> The extra argument is necessary for asymmetric growth/shrinkage
> algorithms, so I recommend you add it: at present you cannot
> take 'unused space' into account since it cannot be computed.

Hmm. Consider the following case. You have an array, using the doubling
algorithm, with a size of 16 and a length (number of used elements) of 10.
Now you add an enumeration with 100 elements in it. Currently, the resizer
function is called with a currsize of 16 (the current number of slots in
the array when it's called) and a length of 110 (how many elements need to
fit into the array). Exponential_resizer then keeps doubling 16 until it
gets a large enough array size, 128 in this case, and returns 128.

If I understand you correctly, in this example you also want 10 (call it
'oldlength') passed in as well. Certainly not difficult; I just don't see
what you get out of it. I suppose you could keep the same percentage of
free space, and return 176 instead of just 128.

If the resizer returns anything except the current array size, a new array
is allocated and the data copied over (hmm, must remember to null out the
old array after copying everything over). Should having a small amount of
data make the algorithm more inclined to copy data over?

Brian
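P.S. The "keep the same percentage of free space" idea above could be
sketched as a three-argument resizer like this. The function name and the
`oldlength` parameter name are my inventions for illustration; it assumes
oldlength > 0.

```ocaml
(* Hypothetical three-argument resizer: preserve the old ratio of
   allocated slots to used slots (oldlength = used slots before the
   insert).  In the example above, 110 * 16/10 = 176 instead of 128.
   A sketch only; assumes oldlength > 0. *)
let proportional_resizer ~currsize ~oldlength ~length =
  if length <= currsize then currsize   (* fits: keep the current array *)
  else
    (* scale the needed length by the old slots-per-element ratio,
       rounding up *)
    (length * currsize + oldlength - 1) / oldlength
```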