From: Brian H. <bh...@sp...> - 2003-04-11 23:14:07
Attachments:
xarray.ml
xarray.mli
A first cut at resizeable arrays. Some comments:

- I haven't made the reallocation strategy explicit. And I'm not sure I
  like the shrinking strategy.
- I stole an idea from Perl, in that negative subscripts index from the
  end of the array- so get x (-1) always returns the last element of x,
  no matter how long x is.
- No comments. Interface is either that for Array, or hopefully
  "obvious".

Note that there are still a lot of functions to write:

- set, insert, and append should have _list, _array, and _xarray
  variants which only realloc once
- remove_range which only reallocs once
- sub hasn't been written
- Didn't write sort functions yet
- map2?
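The Perl-style negative subscripting described above can be sketched as follows. This is a minimal hypothetical sketch, not the attached xarray.ml: the record layout, `of_array`, and the exception choice are all assumptions.

```ocaml
(* Hypothetical resizeable-array record: a backing array plus a
   logical length (the backing array may be longer than len). *)
type 'a xarray = { mutable data : 'a array; mutable len : int }

let of_array a = { data = Array.copy a; len = Array.length a }

let get x i =
  (* Negative subscripts index from the end: get x (-1) is the last
     element of x, no matter how long x is. *)
  let i = if i < 0 then x.len + i else i in
  if i < 0 || i >= x.len then invalid_arg "Xarray.get"
  else x.data.(i)
```

Out-of-range indices (in either direction) still fail, so bounds checking is preserved; only the meaning of negative values changes.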
From: Nicolas C. <war...@fr...> - 2003-04-14 01:12:17
> A first cut at resizeable arrays.

Thanks Brian, we definitely need it in the ExtLib!

> - I haven't made the reallocation strategy explicit. And I'm not sure I
>   like the shrinking strategy.

The question is "when do you use shrinking?". It seems that you're
shrinking each time an element is removed, which I think is not the
best way of doing it. Typically, the life of an array looks like:

- add elements
- eventually remove some
- do some work with it (iter)
- loop

Most of the time the user does not care at all if the array is taking a
little too much memory, but some users in some cases do. So what I
propose is the following:

- the current shrinking strategy only modifies the "len" parameter
- we provide a "pack" primitive that resizes the array to its "real"
  size (len)
- users with memory issues can then call "pack" after having removed a
  lot of elements (the user could also use to_array, but that would
  require him to use two different data structures for the same
  purpose)

> - I stole an idea from Perl, in that negative subscripts index from the
>   end of the array- so get x (-1) always returns the last element of x,
>   no matter how long x is.

If (-1) makes some sense, I'm not so sure about (-2) and (-3) :-) So I
would prefer an explicit get_last x, which is perhaps less confusing to
code readers. And, about the name, what about Vector? It's far more
familiar than Xarray :-)

> - No comments. Interface is either that for Array, or hopefully
>   "obvious"
>
> Note that there are still a lot of functions to write:
>
> - set, insert, and append should have _list, _array, and _xarray
>   variants which only realloc once

We cannot add all of the of_*, to_* and append_* functions for each
data structure, but an iterator can be nice:

    Xarray.appendi x (List.iter l)
    Xarray.appendi x (Array.iter a)
    ...

The problem here is that we don't know the exact size of the structure
before being able to make a copy.
Perhaps it would be a nice start then to define some kind of data
structure generics:

    type 'a iterator
    val count : 'a iterator -> int      (* remaining elements counter, O(1) *)
    val next : 'a iterator -> 'a        (* raises No_more_elements, complexity unspecified *)
    val has_more : 'a iterator -> bool  (* O(1) *)
    val iter : ('a -> unit) -> 'a iterator -> unit      (* complexity unspecified *)
    val map : ('a -> 'b) -> 'a iterator -> 'b iterator  (* this is a lazy map, I like it! *)

Then each data structure (list, array, xarray, reflist ...) will have
to_iterator, of_iterator and append_iterator (not sure about the names)
functions.

For implementation, we can have:

    type 'a iterator = {
        mutable count : int;
        next : (unit -> 'a);
    }

which is very simple, but the data structure which is providing the
"next" function has to store its state itself. We could define:

    type ('a,'b) iterator = {
        mutable count : int;
        mutable state : 'b;
        next : ('b -> 'a * 'b);
    }

but that will make the type information show a little more about the
implementation, which is meaningless to the end-user. Or maybe
something like a functional continuation:

    type 'a fnext =
        | FEnd
        | FIter of (unit -> 'a * 'a fnext)

    type 'a iterator = {
        mutable count : int;
        mutable next : 'a fnext;
    }

and then:

    let next it =
        match it.next with
        | FEnd -> raise No_more_elements
        | FIter fnext ->
            let r, f = fnext () in
            it.count <- it.count - 1;
            it.next <- f;
            r

But perhaps the cost of matching and allocating the returned tuple is
not worth it.

BTW Brian, if you're willing to create a SF account, I can give you
rights on the CVS.

Nicolas Cannasse
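The simple closure-based version of this iterator proposal can be made concrete like this. It is only a sketch: the names `of_list` and `No_more_elements`, and the decision to decrement `count` in a wrapper function, are assumptions, not a committed API.

```ocaml
exception No_more_elements

(* The simple representation: the producing data structure stores its
   own state inside the closure. *)
type 'a iterator = { mutable count : int; next : unit -> 'a }

let of_list l =
  let rest = ref l in
  { count = List.length l;
    next = (fun () ->
      match !rest with
      | [] -> raise No_more_elements
      | x :: tl -> rest := tl; x) }

let count it = it.count
let has_more it = it.count > 0

(* Fetch one element and keep the O(1) remaining-elements counter in
   sync; may raise No_more_elements. *)
let next it =
  let x = it.next () in
  it.count <- it.count - 1;
  x

let iter f it = while has_more it do f (next it) done
```

A `to_iterator` for arrays or xarrays would follow the same shape: a ref holding the current position, captured by the closure.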
From: Blair Z. <bl...@or...> - 2003-04-14 02:04:37
Nicolas Cannasse wrote:
>
> If (-1) makes some sense, I'm not so sure about (-2) and (-3) :-)
> So I would prefer an explicit get_last x, which is perhaps less
> confusing to code readers.

I have no problem with the -1, -2, -3, etc. usage and find it very
understandable. I haven't heard any complaints from Perl users on this
usage.

Best,
Blair

-- 
Blair Zajac <bl...@or...>
Plots of your system's performance - http://www.orcaware.com/orca/
From: Hideo B. <ba...@im...> - 2003-04-14 04:10:54
Hello list,

At Sun, 13 Apr 2003 19:04:17 -0700, Blair Zajac wrote:
>
> Nicolas Cannasse wrote:
> >
> > If (-1) makes some sense, I'm not so sure about (-2) and (-3) :-)
> > So I would prefer an explicit get_last x, which is perhaps less
> > confusing to code readers.
>
> I have no problem with the -1, -2, -3, etc. usage and find it very
> understandable. I haven't heard any complaints from Perl users on
> this usage.

I can imagine why Perl users wouldn't complain about this, but as an
OCaml user, I think I will :-). I can see that there would be no
problem if the index is written explicitly as -1, -2, -3, etc.
However, I don't like the idea, because it essentially overrides (to
some extent) the very useful array bounds checking that OCaml
provides, and could `hide' bugs when variables are used as the index.

Maybe it's just me, but I often (er, at least sometimes?) find myself
writing code like:

    let index = x - y + i in
    (* do something with (Array.get array index) *)

where x, y and i move around in a complex way, and the problem is that
I sometimes forget to do +1 somewhere and end up getting a negative
value as the index (but that is found easily, since it raises an
exception!).

So I would very much like the function which doesn't allow negative
indices to be kept (preferably leaving the name as [get], so as not to
confuse users of standard arrays). If we must have a [get] with
negative indices, it should be a different function.

What do you think?

Regards,
-- 
Hideo Bannai <ba...@im...>
From: Blair Z. <bl...@or...> - 2003-04-14 04:18:55
Hideo Bannai wrote:
>
> Hello list,
>
> At Sun, 13 Apr 2003 19:04:17 -0700, Blair Zajac wrote:
> >
> > Nicolas Cannasse wrote:
> > >
> > > If (-1) makes some sense, I'm not so sure about (-2) and (-3) :-)
> > > So I would prefer an explicit get_last x, which is perhaps less
> > > confusing to code readers.
> >
> > I have no problem with the -1, -2, -3, etc. usage and find it very
> > understandable. I haven't heard any complaints from Perl users on
> > this usage.
>
> I can imagine why Perl users wouldn't complain about this, but as an
> OCaml user, I think I will :-). I can see that there would be no
> problem if the index is written explicitly as -1, -2, -3, etc.
> However, I don't like the idea, because it essentially overrides (to
> some extent) the very useful array bounds checking that OCaml
> provides, and could `hide' bugs when variables are used as the index.
>
> Maybe it's just me, but I often (er, at least sometimes?) find myself
> writing code like:
>
>     let index = x - y + i in
>     (* do something with (Array.get array index) *)
>
> where x, y and i move around in a complex way, and the problem is
> that I sometimes forget to do +1 somewhere and end up getting a
> negative value as the index (but that is found easily, since it
> raises an exception!).
>
> So I would very much like the function which doesn't allow negative
> indices to be kept (preferably leaving the name as [get], so as not
> to confuse users of standard arrays). If we must have a [get] with
> negative indices, it should be a different function.
>
> What do you think?

Sure. I don't feel strongly about this one. Let's have a separate
function for indexing from the end.

Best,
Blair

-- 
Blair Zajac <bl...@or...>
Plots of your system's performance - http://www.orcaware.com/orca/
From: John M. S. <sk...@oz...> - 2003-04-14 06:55:16
> Sure. I don't feel strongly about this one. Let's have a separate
> function for indexing from the end.

No. If there is any doubt, leave it out. This operation is entirely
unnecessary in OCaml, which is perfectly capable of doing index
calculations and at the same time controlling scopes .. unlike some
other languages :-)

    let last a =
        let n = length a in
        if n > 0 then get a (n - 1)
        else invalid_arg "last"

    let from_end a i = get a (length a - 1 - i)  (* easy user space routine *)

Easy routines like this which are not heavily used should NOT be put in
libraries. IMHO :-)

-- 
John Max Skaller, mailto:sk...@oz...
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850
From: John M. S. <sk...@oz...> - 2003-04-14 06:44:45
Nicolas Cannasse wrote:
>> - I haven't made the reallocation strategy explicit. And I'm not sure I
>>   like the shrinking strategy.

I have a published article on this subject. The gist of it is:

1) the strategy is specified by a user supplied function with signature

    strategy : int -> int -> int -> int

The first argument is the allocation size (total slots). The second
argument is the number of used slots. The third argument is the desired
number of used slots. The result is the number of slots which should be
present.

2) It is nice to provide a default function which has reasonable
performance. A memory efficient algorithm is:

    let chunk = 16
    let strategy a u d =
        if d > a - chunk then roundup d chunk + chunk          (* grow *)
        else if d < a - 2 * chunk then roundup d chunk + chunk (* shrink *)
        else a                                                 (* leave alone *)

This basically adds or removes a fixed sized block WITH HYSTERESIS
(like a thermostat): the grow and shrink thresholds differ by a chunk.

The semantics of use are: the physical allocator does NOT have to
allocate the recommended amount. After calling, the result must satisfy

    result >= 0
    result >= d

which is also a constraint on the physical allocator.

A faster routine uses a doubling factor instead, but HYSTERESIS is
vital to prevent thrashing.

-- 
John Max Skaller, mailto:sk...@oz...
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850
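A runnable restatement of the chunked strategy with hysteresis might look like this. The `roundup` helper and the exact shrink threshold (two chunks of slack) are assumptions; the point being demonstrated is only that the grow and shrink triggers differ, so an add-one/remove-one loop near the boundary causes no reallocation.

```ocaml
let chunk = 16

(* Round n up to the next multiple of m. *)
let roundup n m = ((n + m - 1) / m) * m

(* allocated slots -> used slots -> desired used slots -> slot count.
   Grow when the request comes within one chunk of the allocation;
   shrink only when more than two chunks would sit idle.  The gap
   between the two thresholds is the hysteresis. *)
let strategy alloc _used desired =
  if desired > alloc - chunk then roundup desired chunk + chunk
  else if desired < alloc - 2 * chunk then roundup desired chunk + chunk
  else alloc

let reallocs = ref 0

let () =
  let alloc = ref (strategy 0 0 17) in      (* 48 slots for 17 elements *)
  for _ = 1 to 1000 do
    List.iter
      (fun d ->
        let a = strategy !alloc d d in
        if a <> !alloc then (incr reallocs; alloc := a))
      [18; 17]                              (* oscillate: add one, remove one *)
  done;
  assert (!reallocs = 0)                    (* hysteresis: no thrashing *)
```

With a single threshold (grow and shrink at the same point) the same loop would reallocate on every iteration.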
From: Brian H. <bri...@ql...> - 2003-04-14 15:58:02
On Mon, 14 Apr 2003, John Max Skaller wrote:

> 1) the strategy is specified by a user supplied function
> with signature
>
>     strategy : int -> int -> int -> int

I like it. Allows the user much better customization of allocation
routines.

> 2) It is nice to provide a default function which has
> reasonable performance. A memory efficient algorithm is:
[ snip- algorithm that increases size in chunks ]

I'm not sure hysteresis is the best algorithm. Here's the problem.
Assume that we double the array size every time we need to increase its
size, and start at an array size of 1. We then add n elements to an
empty array- how much copying do we need to do? Well, when we add the
second element, we need to copy one element to a new array (of length
2); when we add the third, we need to copy two elements to a new array
(of length 4); when we add the fifth element, we need to copy 4
elements to a new array (of length 8), etc. So we end up copying 2^0 +
2^1 + 2^2 + 2^3 + 2^4 + ... elements. If n is a power of 2, or just 1
element shy of one, we have to copy n-1 elements total. If n is just
over a power of two, we have to copy 2n - 1 elements total. In either
case, adding n elements to an empty array requires O(n) elements to be
copied.

Now, let's consider the same situation, adding n elements, but with a
hysteresis algorithm. Every time we need to increase the array size,
we increase it by a fixed amount- say c elements. To put numbers on
it, let's assume c=10 and n=1000. The first 10 elements get added for
free (no copying), but at element 11 we have to copy the first 10
elements. Then at element 21 we end up having to copy the first 20
elements, at element 31 the first 30 elements, and so on. The total
number of elements we have to copy is 10+20+30+..., or, to make the
series more obvious, 10*(1+2+3+...).

So using the simple formula sum 1..k = k*(k+1)/2, we get approximately:

    c * (n/c) * ((n/c) + 1) / 2  =  n * (n + c) / (2 * c)
                                 =  (n^2 + n*c) / (2 * c)

In other words, we have to copy O(n^2) elements to add n elements. In
our example of n=1000 and c=10, we have to copy 49,500 elements. While
in the doubling algorithm, since 1000 < 1024, we only have to copy 1023
elements, or 2.07% of the work that hysteresis requires. Going to 1025
elements requires a doubling that reallocates and copies 1024 elements,
bringing the total to 2047 elements copied. But hysteresis requires an
extra 3,030 copies- almost three times the work doubling requires.

> A faster routine uses a doubling factor instead,
> but HYSTERESIS is vital to prevent thrashing.

You prevent thrashing by letting shrinkage go longer before shrinking.
Here's the idea: you don't shrink the array until it falls below 25%
capacity, and then you halve the size until the array is back up above
25% capacity (but it'll still be below 50% capacity). Thrashing is
still a possibility, but only if the size is bouncing over a very wide
range- at which point I question if it's really thrashing.

Although this is why I like the idea of having a user definable
reallocation strategy. Don't like the default? Write your own. Even
better, I could see users writing custom reallocation strategies that
"hook into" how the arrays are being used, for example:

    let my_realloc_strategy currsize used =
        if my_algorithm_phase = Heavy_allocation_and_deallocation then
            (* Aggressively upsize the array to avoid copying, and
             * rarely if ever downsize the array, I'll need those
             * slots. Probably use a doubling algorithm. *)
            ...
        else
            (* Conservatively upsize the array, but aggressively
             * downsize. Probably use a hysteresis algorithm. *)
            ...

Brian
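The arithmetic above can be checked with a small simulation. The helper `copies_to_grow` is hypothetical, purely for verifying the copy counts in the argument; it charges a copy of all existing elements at every reallocation.

```ocaml
(* Count elements copied while growing an array one element at a time
   to n elements.  [grow] maps an insufficient capacity to a larger
   one; [init] is the starting capacity. *)
let copies_to_grow ~init grow n =
  let copied = ref 0 and cap = ref init in
  for used = 1 to n do
    if used > !cap then begin
      copied := !copied + (used - 1);             (* copy the old contents *)
      while used > !cap do cap := grow !cap done  (* reallocate *)
    end
  done;
  !copied

let doubling c = c * 2   (* capacities 1, 2, 4, 8, ... *)
let chunked c = c + 10   (* fixed chunks of c = 10 *)

let () =
  (* prints: doubling: 1023, chunks of 10: 49500 *)
  Printf.printf "doubling: %d, chunks of 10: %d\n"
    (copies_to_grow ~init:1 doubling 1000)
    (copies_to_grow ~init:10 chunked 1000)
```

This reproduces the numbers in the message: 1023 copies for doubling and 10+20+...+990 = 49,500 copies for fixed chunks of 10, when growing to 1000 elements.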
From: John M. S. <sk...@oz...> - 2003-04-24 19:00:05
Brian Hurt wrote:

> On Mon, 14 Apr 2003, John Max Skaller wrote:
> I'm not sure hysteresis is the best algorithm.
[snip]
> You prevent thrashing by letting shrinkage go longer before shrinking.

Must be some misunderstanding here. Hysteresis doesn't imply adding a
fixed chunk on growing, nor removing one on shrinking. It simply
implies that the thresholds for growth and shrinkage are not the same,
but separated somehow, so that a requirement for 1 more slot which
causes an allocation, when followed by a requirement for 1 less slot,
does NOT cause a reduction.

The idea applies no matter whether growth is a constant chunk, or a
fractional increase, or some other calculation: arrays grow when they
must, but don't shrink until they waste a certain amount of space
(which is more than the amount wasted after an allocation). This is
what you described for the doubling routine: if you allow 25% wastage
before shrinking, and you remove only 15% of the unused space when you
do shrink, then small vibrations don't cause either allocations or
deallocations.

Anyhow, the point is not to use a naive routine where the growth and
shrinkage triggers are the same, since a loop which adds an element and
then removes one repeatedly would cause violent and unacceptable
thrashing.

And the real point is: there is no single optimal strategy, it depends
on the use to which the array is being put. Hence, allowing the user to
specify the strategy allows the user to tune performance in a way
reasonably well separated from the overall program logic.

In my system I had:

(1) a standard strategy (chunks of 16 elements)
(2) a global variable which was initialised to the standard strategy
    but which could be changed
(3) a default constructor using the global strategy
(4) a constructor accepting a strategy argument
(5) a method for changing the strategy for any individual array

In addition, my strategies were objects which had some data, which
could also be changed (eg: the chunk strategy was parameterised by
chunk size).

The cost is a function pointer in each array, and some overhead calling
it when necessary.

-- 
John Max Skaller, mailto:sk...@oz...
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850
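The five-point design above translates to OCaml roughly as follows. This is a sketch under assumptions: all names are invented, and the strategy is stored as a plain function (closing over its parameters) rather than as an object with mutable data.

```ocaml
(* alloc -> used -> desired -> recommended slot count *)
type strategy = int -> int -> int -> int

(* (1) a standard strategy: chunks of 16 elements *)
let standard_strategy : strategy =
  fun _alloc _used desired -> ((desired + 15) / 16) * 16 + 16

(* (2) a global, changeable default *)
let default_strategy = ref standard_strategy

type 'a xarray = {
  mutable data : 'a array;
  mutable len : int;
  mutable resize : strategy;   (* the per-array function pointer *)
}

(* (3) default constructor using the global strategy, and
   (4) a constructor accepting a strategy argument, via an optional
   labelled parameter.  [dummy] fills unused slots. *)
let make ?strategy n dummy =
  let resize =
    match strategy with Some s -> s | None -> !default_strategy in
  { data = Array.make (max n 1) dummy; len = 0; resize }

(* (5) changing the strategy of an individual array *)
let set_strategy x s = x.resize <- s
```

The cost is exactly what the message describes: one closure per array, plus a call through it on each resize decision.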
From: Brian H. <bri...@ql...> - 2003-04-24 20:16:26
On Fri, 25 Apr 2003, John Max Skaller wrote:

> Brian Hurt wrote:
>
> > On Mon, 14 Apr 2003, John Max Skaller wrote:
> > I'm not sure hysteresis is the best algorithm.
> [snip]
> > You prevent thrashing by letting shrinkage go longer before shrinking.
>
> Must be some misunderstanding here.

Yes there was. I was misunderstanding you. My apologies.

The good news is that in the current version of Xarray in the
libraries, every time I allocate a new array I allow the caller to pass
in what reallocation strategy to use. Neither of the two I have defined
is what you are talking about- but new strategies can easily be added.
I tried to document in xarray.mli how to write a resizer function. If
the comments aren't clear, please let me know and I'll improve them.

> And the real point is: there is no single optimal
> strategy, it depends on the use to which the array
> is being put. Hence, allowing the user to specify
> the strategy allows the user to tune performance
> in a way reasonably well separated from the
> overall program logic.

I fully agree. I can even envision application-specific resizer
functions, which "know" the behavior of the algorithm being used and
can behave differently as the algorithm changes its allocation
behavior. Or implement heuristics to try to predict the algorithm.
Etc.

The doubling algorithm is the overall generally best algorithm that I
can think of. Basically, adding and subtracting elements is amortized
O(1)- making the construction and use of large xarrays not
implausible.

> In my system I had:
>
> (1) a standard strategy (chunks of 16 elements)

I'm using doubling. See above.

> (2) a global variable which was initialised to the standard
>     strategy but which could be changed

Hmm. Could be done with a reference. But this is a very non-ML sort of
concept.

> (3) a default constructor using the global strategy

Got it.

> (4) a constructor accepting a strategy argument

Got it.

> (5) a method for changing the strategy for any individual array

I only allow new strategies when allocating new arrays. This includes
'implicit' allocations like of_enum and copy. Although I suppose this
could be added.

> In addition, my strategies were objects which had some
> data, which could also be changed (eg: the chunk strategy
> was parameterised by chunk size).

Partial evaluation takes care of this.

> The cost is a function pointer in each array,
> and some overhead calling it when necessary.

Already paying it. I call the resizer function (which tells me what
size the array should be) every time I add or remove an element from
the array.

Brian
From: John M. S. <sk...@oz...> - 2003-04-25 06:52:50
Brian Hurt wrote:

> The doubling algorithm is the overall generally best algorithm that I
> can think of. Basically, adding and subtracting elements is amortized
> O(1)- making the construction and use of large xarrays not
> implausible.

By best you mean 'fastest'. This is clearly not the only requirement.
For example, it would be a disaster in a prime number generator in
which minimisation of storage use is more important. There the time to
copy is irrelevant compared to the time computing primes, and the need
to fit large numbers in storage is dominant.

Obviously, doubling is less efficient speed-wise than tripling. My
experiments showed that, as you claim, a factor of 2 is close to
optimal: multiplying by 10 is faster, but not much. OTOH a factor of
1.8 begins to slow the system down... on the tests I ran, anyhow.

> Hmm. Could be done with a reference. But this is a very non-ML sort of
> concept.

I agree: loss of reentrancy .. BUT provided the store is atomic, it
only affects performance, not semantics, if the variable is changed at
random, even by multiple threads. Setting a strategy for a whole
program depending on what phase of execution it is up to is fairly
useful.

> > (5) a method for changing the strategy for any individual array
>
> I only allow new strategies when allocating new arrays. This includes
> 'implicit' allocations like of_enum and copy. Although I suppose this
> could be added.

It can also be left out until someone needs it (if unsure of utility,
LEAVE IT OUT :-)

--------------------------------------------------------
One thing I found: I needed 3 arguments, not 2:

    current slots, current used slots, required size

Conceptually, the reason is: the ideal size may be influenced by the
allocated size. For example: double when necessary, but only shrink if
the desired size is zero. The extra argument is necessary for
asymmetric growth/shrinkage algorithms, so I recommend you add it: at
present you cannot take 'unused space' into account, since it cannot be
computed.

-- 
John Max Skaller, mailto:sk...@oz...
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850
From: Brian H. <bri...@ql...> - 2003-04-25 19:20:35
On Fri, 25 Apr 2003, John Max Skaller wrote:

> By best you mean 'fastest'.

Fastest in the most common cases, to be precise. With "reasonable"
(for suitably loose definitions of reasonable) space wastage. With
doubling you never waste more than about 75% of the space, and rarely
waste 50%. This is actually not bad compared to other data structures,
if you consider the wasted space "infrastructure overhead". Consider a
binary tree where the data is stored only at the leaves. You will have
about as many branch nodes as you have leaf nodes. Suddenly half the
memory used is in branch nodes holding references to other branch
nodes, not references to data- "wasted" space.

Increasing the multiple decreases the number of copies you need to
make, at the cost of greatly increasing the "wasted space" on average.
Plus, powers of two are generally easier to allocate than powers of
anything else.

> > Hmm. Could be done with a reference. But this is a very non-ML sort of
> > concept.
>
> I agree: loss of reentrancy .. BUT provided the store is atomic, it
> only affects performance, not semantics, if the variable is changed at
> random, even by multiple threads. Setting a strategy for a whole
> program depending on what phase of execution it is up to is fairly
> useful.

I was just thinking not-functional. Not in that it wouldn't work, but
in that it's not how most functional programming flows, if that makes
sense.

> --------------------------------------------------------
> One thing I found: I needed 3 arguments, not 2:
>
>     current slots, current used slots, required size

The inputs xarray has to give the resizer algorithm are:

- the current number of slots the array has
- the current number of slots the array needs

Note that if I'm increasing or decreasing the array size, I adjust the
number of slots needed before passing it into the resizer. What the
resizer returns should be the number of slots the array should have.

Once the resizer calculates what size the array should be, the xarray
code actually deals with allocating a new array and copying data over
(if needed), etc. Maybe these parameters should have better names.

> Conceptually, the reason is: the ideal size
> may be influenced by the allocated size. For example:

Always. You need to return the current size if you don't want to
reallocate the array.

> The extra argument is necessary for asymmetric growth/shrinkage
> algorithms, so I recommend you add it: at present you cannot
> take 'unused space' into account, since it cannot be computed.

Hmm. Consider the following case. You have an array, using the
doubling algorithm, with a size of 16 and a length (number of used
elements) of 10. Now, you add an enumeration with 100 elements in it.

Currently, the resizer function is called with a currsize of 16 (the
current number of spaces in the array when it's called) and a length of
110 (how many elements need to fit into the array).
Exponential_resizer then keeps doubling 16 until it gets a long enough
array size- 128 in this case. It returns 128.

If I understand you correctly, in this example you also want 10 (call
it 'oldlength') passed in as well. Certainly not difficult. I just
don't see what you get out of it. I suppose you could keep the same
percentage of free space, and return 176 instead of just 128.

If the resizer returns anything except the current array size, a new
array is allocated and the data copied over (hmm. Must remember to
null out the old array after copying everything over). Should having a
small amount of data make the algorithm more inclined to copy data
over?

Brian
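The `Exponential_resizer` behaviour in the example above can be sketched like this. It is a sketch of the doubling rule described in the thread, not the actual ExtLib code; the 25%-occupancy shrink rule is taken from Brian's earlier message.

```ocaml
(* Keep doubling the current slot count until it covers the requested
   length; shrink by halving while occupancy falls below 25%.
   currslots -> newlength -> recommended slots. *)
let exponential_resizer currslots newlength =
  let n = ref (max currslots 1) in
  while newlength > !n do n := !n * 2 done;
  (* halve while under 25% occupancy, ending above 25% (below 50%) *)
  while !n > 1 && newlength * 4 < !n do n := !n / 2 done;
  !n
```

On the example in the message, a 16-slot array asked to hold 110 elements doubles 16 -> 32 -> 64 -> 128 and returns 128; returning the current size signals "don't reallocate".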
From: John M. S. <sk...@oz...> - 2003-04-26 15:30:48
Brian Hurt wrote:

> On Fri, 25 Apr 2003, John Max Skaller wrote:
>
> > --------------------------------------------------------
> > One thing I found: I needed 3 arguments, not 2:
> >
> >     current slots, current used slots, required size
>
> The inputs xarray has to give the resizer algorithm are:
>
> - the current number of slots the array has
> - the current number of slots the array needs
>
> > The extra argument is necessary for asymmetric growth/shrinkage
> > algorithms, so I recommend you add it: at present you cannot
> > take 'unused space' into account, since it cannot be computed.
>
> Hmm. Consider the following case. You have an array, using the
> doubling algorithm, with a size of 16 and a length (number of used
> elements) of 10. Now, you add an enumeration with 100 elements in it.
>
> Currently, the resizer function is called with a currsize of 16 (the
> current number of spaces in the array when it's called) and a length
> of 110 (how many elements need to fit into the array).
> Exponential_resizer then keeps doubling 16 until it gets a long
> enough array size- 128 in this case. It returns 128.
>
> If I understand you correctly, in this example you also want 10 (call
> it 'oldlength') passed in as well.

Yes.

> Certainly not difficult. I just don't
> see what you get out of it.

Suppose you have a rule: when you're increasing the number of elements
in the array, you can resize it, but never when you're decreasing them.
That might be useful, but it cannot be implemented unless you know the
number of elements in the array -- I don't mean slots of store, but
actual occupied slots.

For example: the number of slots is set to the number of elements on
expansion. So, adding 1 element at a time up to 5, then subtracting,
then adding again:

    data  slots
      0     0
      1     1
      2     2
      3     3
      4     4
      5     5
      4     5
      3     5
      2     5
      1     5
      0     5
      1     1
      2     2
      3     3
      4     4
      5     5

The number of slots can shrink, but only when the number of used slots
increases. Why would you want to do this? I have no idea. But it is a
viable algorithm, and it cannot be implemented without passing the
number of used slots in .. a general rule says "pass the whole of the
state in", and the state is characterized by both the physical slot
count and the number of used slots.

Note that one could argue that passing in the size of a slot (in bytes)
also makes sense, since a strategy may be based on tuning around memory
amounts (Kbytes) due to the finiteness of this quantity, rather than
the more abstract notion of a slot.

-- 
John Max Skaller, mailto:sk...@oz...
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850
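The three-argument rule illustrated by the table above (resize only while the element count is increasing) might be written as follows. The name `grow_only_resizer` is invented; the point is that the decision needs the old occupied-slot count, not just the slot count and the required size.

```ocaml
(* slots -> oldlength -> newlength -> recommended slot count.
   On expansion, set the slot count to exactly the new length (which
   may shrink the allocation); on contraction, leave it alone.  This
   is only expressible because oldlength is passed in. *)
let grow_only_resizer slots oldlength newlength =
  if newlength > oldlength then newlength
  else slots
```

Replaying the table: growing 0 -> 5 keeps slots equal to data; removing back down to 0 leaves 5 slots untouched; the first re-insert snaps the allocation back down to 1 slot.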
From: Brian H. <bri...@ql...> - 2003-04-26 22:05:04
On Sun, 27 Apr 2003, John Max Skaller wrote:

> Brian Hurt wrote:
>
> > Certainly not difficult. I just don't
> > see what you get out of it.
>
> Suppose you have a rule: when you're increasing the number of
> elements in the array, you can resize it, but never when you're
> decreasing them. That might be useful, but it cannot be implemented
> unless you know the number of elements in the array -- I don't mean
> slots of store, but actual occupied slots.
>
> For example: the number of slots is set to the number of elements on
> expansion. So, adding 1 element at a time up to 5, then subtracting,
> then adding again:
>
>     data  slots
>       0     0
>       1     1
>       2     2
>       3     3
>       4     4
>       5     5
>       4     5
>       3     5
>       2     5
>       1     5
>       0     5
>       1     1
>       2     2
>       3     3
>       4     4
>       5     5

Ah! Only reallocate when you start adding elements back in, but then
shrink to a correct size. You'd have to save state recording that you
had shrunk, but that's not hard. OK. Yes, for that you need to know if
you're adding or removing slots.

I was thinking you meant something like:

    data  slots
      0     0
      1     1
      2     2
      3     3
      2     3
      1     3
      0     0   <- shrink happened
      1     1
      2     2
    ... etc.

You have the shrink happening on the first insert. And that's actually
not *that* strange a resizing algorithm, especially if you know that
deletes are "bursty". It saves copying, and may even save short-term
memory usage (if the GC wouldn't get around to collecting the old
array soon enough).

> Note that one could argue that passing in the size of a slot (in
> bytes) also makes sense, since a strategy may be based on tuning
> around memory amounts (Kbytes) due to the finiteness of this
> quantity, rather than the more abstract notion of a slot.

Especially in that I (the xarray module) don't know how large an 'a
is. Most likely it's just a reference off somewhere to some other
value. Which is one of the reasons adding a user-defined resizer
function is useful. If you do know (or at least can guess) how big the
data structures you're kicking around are, write a specific resizer
for them. Or have a parameterizable resizing function, like I did with
the step function.

Does anyone mind if I use named parameters for the resizer? Use the
type (for example):

    resizer : numslots:int -> oldlength:int -> newlength:int -> int

? It would make it more obvious what the various arguments are.

Brian
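With the labelled type proposed above, a parameterized resizer and its call site could read like this. `step_resizer` is an invented example in the spirit of "the step function" mentioned in the message; the real one may differ, and the thresholds here are assumptions.

```ocaml
(* A resizer parameterized by a step size, matching the labelled type
   resizer : numslots:int -> oldlength:int -> newlength:int -> int.
   Grow when the request overflows the allocation; shrink only when
   more than two steps of slack accumulate. *)
let step_resizer step ~numslots ~oldlength:_ ~newlength =
  if newlength > numslots || newlength < numslots - 2 * step
  then ((newlength + step - 1) / step) * step  (* round up to a step *)
  else numslots

(* Partial application fixes the step; the labels make the call site
   self-documenting. *)
let needed = step_resizer 16 ~numslots:16 ~oldlength:10 ~newlength:20
```

Here `needed` is 32: 20 elements no longer fit in 16 slots, so the allocation is rounded up to the next multiple of the step.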
From: John M. S. <sk...@oz...> - 2003-04-27 01:58:40
Brian Hurt wrote:

> Especially in that I (the xarray module) don't know how large an 'a
> is. Most likely it's just a reference off somewhere to some other
> value.

Oh yeah .. my C/C++ experience lets me down here (many unboxed data
types of different sizes).

> Does anyone mind if I use named parameters for the resizer? Use the
> type (for example):
>
>     resizer : numslots:int -> oldlength:int -> newlength:int -> int
>
> ? It would make it more obvious what the various arguments are.

I agree, this is a case where named parameters are a good idea: usage
of the strategy function is going to be light (not much use), so the
extra clutter of naming is small, but the benefit of seeing that those
unfamiliar uses are correct is great.

BTW, with respect to boxing: we cannot support the unboxed float array
(etc.) types the specialised OCaml arrays can. Or can we?

[Funny, but once one gets down to optimisation, the familiar problems
of C/C++ turn up again: similarly, once one wants advanced name
control, the same problems, and solutions, begin to suggest themselves:
C++ is messy because users want things that are convenient and
efficient, which is not always the same thing as pure and clean..]

-- 
John Max Skaller, mailto:sk...@oz...
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850