Re: [Ocaml-lib-devel] New Library 1: XList

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

On Sat, 15 Mar 2003, Nicolas Cannasse wrote:

> > OK, The first of three libraries I'm submitting for comments:  XList.
> >
> > XList is a reimplementation of the Ocaml standard List library, except all
> > non-tail recursion has been removed, and instead I use Olivier Andrieu's
> > setcdr function.  This should be as fast as recursion for short lists, but
> > work correctly (shouldn't blow stack) for long lists.
> 
> Ok, thanks alot, few remarks :
> - the setcdr should of course be hidden in the ExtList.mli

Heck yeah! :-)

> - I switched you "length" implementation to the INRIA one ( using a non
> local "aux" function, since they must have had a good reason - performances
> perhaps -  to do so ) -- in general, I did make all local auxiliary
> functions being "global", if anyone could test the speed and tell if it is
> better this way...

Hmm.  If I had to guess, I'd guess that making them global slows things 
down, if it changes anything at all.  By making them internal, I'm 
encouraging the compiler to inline them.  And limiting their namespace.  I 
don't want to think about how many internal 'loop' functions I wrote :-).

> - replaced "failwith" & "invalid_argument" calls by corresponding ExtLib
> exceptions raise ( btw, I switched the name of the exception from
> No_such_element - quite obscure - to Empty_list : fully descriptive )

The code that does this is usually where I'm emulating the standard 
library.

> - about the programming style, removed all the ";;" (useless) and the
> indentation after a match .. with ( and added missings | for
> first-case-matching : this make the code more readable )

I'm not 100% sure of this, but don't care enough to argue.

> - used the let ... = function when it was possible

I'm still getting comfortable with the let ... = function format.  :-).  
In poking around in the produce assembly language, I can't see that it 
makes a difference.

> - while talking about performance, could someone do some test program for it
> ? I would like to test for example the improvement of inlining the setcdr
> calls in duplicate, append, etc.

Performance I'm less concerned with.  Having a test suite would be nice.  
Up until now, I've mainly been cutting&pasting into top-level ocaml, and 
throwing random data at it.  A more standard testing program would be 
nice.

> - for the fold tail recursive calls, I propose the naming of  tail_fold_left
> / right for tail recursives one ( since most of the time, you don't really
> need tail recursives calls )

Correctness counts for more with me than performance.  The thing I like 
about tail recursion is that it works no matter how huge the lists are.  
Failing on long lists is most likely to happen in production, in hard to 
replicate cases.  So I'd rather make the tail recursive versions default- 
and provide non-tail-recursive versions if needed.

The experimenting I did with setcdr and append indicated that setcdr was 
as fast, and maybe even faster, than the non-tail-recursive versions, 
while list reversing was about 30% slower.  So if I could replace a 
non-tail-recursive version with a setcdr version, I didn't see a need to 
supply an old form.

> - corrected a bug since fold_right' was recursively calling... fold_right :)
> - same : memq calling mem  , assq calling assoc , mem_assq calling mem_assoc
> ( too fast copy-past, Brian :=)

Typos R' us.

> - removed useless  begin...end blocks, corrected indentation at some points,
> remove useless parenthesis in multi pattern matching ( should cause the
> allocation of tuples - but I don't really know since the ocaml stdlib is
> using this style sometimes )

Commas cause tuples, parens only affect grouping.  In ocaml:
# 1,2;;
- : int * int = (1, 2)
# (1,2);;
- : int * int = (1, 2)

On grouping, by which I mean both parens and begin/end: I've been bitten 
once too often by incorrect associations.  One of my favorite obfuscated C 
tricks is:

#define TRUE (1 & 3 == 1)
#define FALSE (!TRUE)

If there's one bitch I have with Ocaml, it's all the shift/reduce
conflicts, which show up in the language as uncertain associations.  So I 
liberally add parens and begin/end blocks.  Because it doesn't slow 
anything down, and makes it easier to modify the code later.

> - added the "exception Different_list_size of string" ( to replace
> invalid_argument in iter2, map2, etc... )

This changes behavior of the standard library.  Which is "correct" is a 
different argument- this is no longer a silent change.

> - about your sort function, I would like to see what are the speed
> differences on sorting ~1000 items lists and ~1M items lists ( compared to
> the List.sort of the ocaml stdlib ) + switched the "cmp" argument to
> optional - default = compare ( consistent with the current ExtList )

Haven't done any timings here.

> - will somehow generate the mli and write the documentation ? I dislike the
> mathematical explainations of the ocaml stdlib documentation, I think a
> short description, with perhaps one sample for each function would be better
> for beginners, and then the specification can come ( order preserved,
> complexity, tail-rec, etc. )

I'll take a swing at these.

Brian