From: Brian H. <bri...@ql...> - 2003-03-17 18:58:13
|
On Sat, 15 Mar 2003, Nicolas Cannasse wrote: > > OK, The first of three libraries I'm submitting for comments: XList. > > > > XList is a reimplementation of the Ocaml standard List library, except all > > non-tail recursion has been removed, and instead I use Olivier Andrieu's > > setcdr function. This should be as fast as recursion for short lists, but > > work correctly (shouldn't blow stack) for long lists. > > Ok, thanks alot, few remarks : > - the setcdr should of course be hidden in the ExtList.mli Heck yeah! :-) > - I switched you "length" implementation to the INRIA one ( using a non > local "aux" function, since they must have had a good reason - performances > perhaps - to do so ) -- in general, I did make all local auxiliary > functions being "global", if anyone could test the speed and tell if it is > better this way... Hmm. If I had to guess, I'd guess that making them global slows things down, if it changes anything at all. By making them internal, I'm encouraging the compiler to inline them. And limiting their namespace. I don't want to think about how many internal 'loop' functions I wrote :-). > - replaced "failwith" & "invalid_argument" calls by corresponding ExtLib > exceptions raise ( btw, I switched the name of the exception from > No_such_element - quite obscure - to Empty_list : fully descriptive ) The code that does this is usually where I'm emulating the standard library. > - about the programming style, removed all the ";;" (useless) and the > indentation after a match .. with ( and added missings | for > first-case-matching : this make the code more readable ) I'm not 100% sure of this, but don't care enough to argue. > - used the let ... = function when it was possible I'm still getting comfortable with the let ... = function format. :-). In poking around in the produce assembly language, I can't see that it makes a difference. > - while talking about performance, could someone do some test program for it > ? I would like to test for example the improvement of inlining the setcdr > calls in duplicate, append, etc. Performance I'm less concerned with. Having a test suite would be nice. Up until now, I've mainly been cutting&pasting into top-level ocaml, and throwing random data at it. A more standard testing program would be nice. > - for the fold tail recursive calls, I propose the naming of tail_fold_left > / right for tail recursives one ( since most of the time, you don't really > need tail recursives calls ) Correctness counts for more with me than performance. The thing I like about tail recursion is that it works no matter how huge the lists are. Failing on long lists is most likely to happen in production, in hard to replicate cases. So I'd rather make the tail recursive versions default- and provide non-tail-recursive versions if needed. The experimenting I did with setcdr and append indicated that setcdr was as fast, and maybe even faster, than the non-tail-recursive versions, while list reversing was about 30% slower. So if I could replace a non-tail-recursive version with a setcdr version, I didn't see a need to supply an old form. > - corrected a bug since fold_right' was recursively calling... fold_right :) > - same : memq calling mem , assq calling assoc , mem_assq calling mem_assoc > ( too fast copy-past, Brian :=) Typos R' us. > - removed useless begin...end blocks, corrected indentation at some points, > remove useless parenthesis in multi pattern matching ( should cause the > allocation of tuples - but I don't really know since the ocaml stdlib is > using this style sometimes ) Commas cause tuples, parens only affect grouping. In ocaml: # 1,2;; - : int * int = (1, 2) # (1,2);; - : int * int = (1, 2) On grouping, by which I mean both parens and begin/end: I've been bitten once too often by incorrect associations. One of my favorite obfuscated C tricks is: #define TRUE (1 & 3 == 1) #define FALSE (!TRUE) If there's one bitch I have with Ocaml, it's all the shift/reduce conflicts, which show up in the language as uncertain associations. So I liberally add parens and begin/end blocks. Because it doesn't slow anything down, and makes it easier to modify the code later. > - added the "exception Different_list_size of string" ( to replace > invalid_argument in iter2, map2, etc... ) This changes behavior of the standard library. Which is "correct" is a different argument- this is no longer a silent change. > - about your sort function, I would like to see what are the speed > differences on sorting ~1000 items lists and ~1M items lists ( compared to > the List.sort of the ocaml stdlib ) + switched the "cmp" argument to > optional - default = compare ( consistent with the current ExtList ) Haven't done any timings here. > - will somehow generate the mli and write the documentation ? I dislike the > mathematical explainations of the ocaml stdlib documentation, I think a > short description, with perhaps one sample for each function would be better > for beginners, and then the specification can come ( order preserved, > complexity, tail-rec, etc. ) I'll take a swing at these. Brian |