You can subscribe to this list here.
2003 |
Jan
|
Feb
(81) |
Mar
(97) |
Apr
(88) |
May
(80) |
Jun
(170) |
Jul
(9) |
Aug
|
Sep
(18) |
Oct
(58) |
Nov
(19) |
Dec
(7) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2004 |
Jan
(22) |
Feb
(9) |
Mar
(28) |
Apr
(164) |
May
(186) |
Jun
(101) |
Jul
(143) |
Aug
(387) |
Sep
(69) |
Oct
(14) |
Nov
(8) |
Dec
(99) |
2005 |
Jan
(10) |
Feb
(34) |
Mar
(24) |
Apr
(7) |
May
(41) |
Jun
(20) |
Jul
(3) |
Aug
(23) |
Sep
(2) |
Oct
(26) |
Nov
(41) |
Dec
(7) |
2006 |
Jan
(6) |
Feb
(3) |
Mar
(11) |
Apr
|
May
|
Jun
(5) |
Jul
(8) |
Aug
(20) |
Sep
|
Oct
(6) |
Nov
(5) |
Dec
|
2007 |
Jan
|
Feb
(1) |
Mar
|
Apr
(3) |
May
(2) |
Jun
|
Jul
|
Aug
(1) |
Sep
(7) |
Oct
(6) |
Nov
(19) |
Dec
(11) |
2008 |
Jan
|
Feb
(7) |
Mar
(9) |
Apr
(21) |
May
(42) |
Jun
(27) |
Jul
(28) |
Aug
(26) |
Sep
(16) |
Oct
(32) |
Nov
(49) |
Dec
(65) |
2009 |
Jan
(35) |
Feb
(20) |
Mar
(36) |
Apr
(42) |
May
(111) |
Jun
(99) |
Jul
(70) |
Aug
(25) |
Sep
(15) |
Oct
(29) |
Nov
(3) |
Dec
(18) |
2010 |
Jan
(10) |
Feb
(4) |
Mar
(57) |
Apr
(63) |
May
(71) |
Jun
(64) |
Jul
(30) |
Aug
(49) |
Sep
(11) |
Oct
(4) |
Nov
(2) |
Dec
(3) |
2011 |
Jan
(1) |
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
(1) |
Jul
(1) |
Aug
(2) |
Sep
|
Oct
|
Nov
|
Dec
(1) |
2012 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2013 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2014 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
(2) |
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
(1) |
Dec
|
2015 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2021 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
|
2022 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2023 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
(1) |
2024 |
Jan
(1) |
Feb
(3) |
Mar
(6) |
Apr
(2) |
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
(1) |
2025 |
Jan
(2) |
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
(1) |
Aug
(1) |
Sep
(5) |
Oct
|
Nov
|
Dec
|
From: skaller <sk...@us...> - 2005-01-17 01:38:37
|
On Mon, 2005-01-17 at 11:37, Brian Hurt wrote: > My apologies for being silent for so long (life intervened). One > question on this subject: is there a reason we're not using OUnit? Dependencies. The test mechanism needs to be independent of everything that isn't either in standard Ocaml distro or Extlib. This problem would go away if GODI was standard but it isn't .. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Brian H. <bh...@sp...> - 2005-01-17 00:35:13
|
My apologies for being silent for so long (life intervened). One question on this subject: is there a reason we're not using OUnit? I have some minor annoyances with it (it needs findlib installed is the biggest one), but not enough to write my own. I'm about 1/2 way done with a module implementing Chris Okasaki's random access lists, if people are interested. For those who don't know, these are lists that work mostly like normal linked lists, except that getting the nth element is O(log N), and not O(N). Several other operations are similiarly more efficient. The reason I bring this up is that I'm writting the test suite in OUnit. Brian |
From: Janne H. <ja...@hy...> - 2005-01-15 22:10:14
|
> Looks ok to me. > Just quickly reviewed, not tested ;) OK, submitted. I made a tester for it, so it should work! Best regards, Janne |
From: Janne H. <ja...@hy...> - 2005-01-13 22:58:42
|
Hello everyone, anyone had a chance to look at this yet? It's been a while since I posted this and I'd like to submit. A little while ago there was discussion about the future of ExtLib. One small thing I'd like to see in ExtLib would be a way to format human readable numbers (-h in most GNU software, e.g., 1000000 => 1M etc). There was also some discussion about adding date and time functions. Did anything happen on this front? ciao, janne Janne Hellsten wrote: > Hopefully this is the last one for now. This is bigger than the > previous patch as it addresses more problems. With these changes, > BitSet passes all current testing suite tests. > > If you guys could take a look again..? I will commit after OK'd. is_set > change hasn't been discussed here, but I took the liberty to make that > change anyway. > > If you want more speed, Ocamlopt should be passed -noassert in the > Makefile to turn off assertion checks. This should be pretty safe, > since there doesn't seem to be much code using asserts anyway. > > Summary of changes: > > - Assertion guarded bget/bset/bfill/bblit functions > - is_set raises Negative_index if passed a negative index (Doesn't make > much sense to pass in negative indices, right?) > - Removed unite/intersect/differentiate/differentiate_sym > implementations and rewrote them using union/inter/diff/sym_diff. Might > make just slightly slower, but I guess it's better that they don't > overflow anymore. > - Small optimizations to union, diff etc functions (innerloops a few > instructions shorter) > > ciao, > janne > > > ------------------------------------------------------------------------ > > ? patch_BitSet.txt > ? patch_BitSet_2.txt > Index: extlib-dev/bitSet.ml > =================================================================== > RCS file: /cvsroot/ocaml-lib/extlib-dev/bitSet.ml,v > retrieving revision 1.13 > diff -u -r1.13 bitSet.ml > --- extlib-dev/bitSet.ml 27 Dec 2004 18:31:38 -0000 1.13 > +++ extlib-dev/bitSet.ml 27 Dec 2004 19:33:58 -0000 > @@ -21,11 +21,28 @@ > type intern > > let bcreate : int -> intern = Obj.magic String.create > -external bget : intern -> int -> int = "%string_unsafe_get" > -external bset : intern -> int -> int -> unit = "%string_unsafe_set" > +external fast_get : intern -> int -> int = "%string_unsafe_get" > +external fast_set : intern -> int -> int -> unit = "%string_unsafe_set" > external fast_bool : int -> bool = "%identity" > -let bblit : intern -> int -> intern -> int -> int -> unit = Obj.magic String.blit > -let bfill : intern -> int -> int -> int -> unit = Obj.magic String.fill > +let fast_blit : intern -> int -> intern -> int -> int -> unit = Obj.magic String.blit > +let fast_fill : intern -> int -> int -> int -> unit = Obj.magic String.fill > +let fast_length : intern -> int= Obj.magic String.length > + > +let bget s ndx = > + assert (ndx >= 0 && ndx < fast_length s); > + fast_get s ndx > + > +let bset s ndx v = > + assert (ndx >= 0 && ndx < fast_length s); > + fast_set s ndx v > + > +let bblit src srcoff dst dstoff len = > + assert (srcoff >= 0 && dstoff >= 0 && len >= 0); > + fast_blit src srcoff dst dstoff len > + > +let bfill dst start len c = > + assert (start >= 0 && len >= 0); > + fast_fill dst start len c > > exception Negative_index of string > > @@ -102,12 +119,13 @@ > | false -> unset t > > let is_set t x = > - let pos = x lsr log_int_size and delta = x land int_size in > - let size = t.len in > - if pos < size then > - fast_bool (((bget t.data pos) lsr delta) land 1) > - else > - false > + if x < 0 then error "is_set"; > + let pos = x lsr log_int_size and delta = x land int_size in > + let size = t.len in > + if pos < size then > + fast_bool (((bget t.data pos) lsr delta) land 1) > + else > + false > > > exception Break_int of int > @@ -216,44 +234,6 @@ > in > make 0 > > -let intersect t t' = > - for i = 0 to t.len - 1 do > - bset t.data i ((bget t.data i) land (bget t'.data i)) > - done > - > -let unite t t' = > - let size = t.len and size' = t'.len in > - let rec unite_loop = function > - | -1 -> () > - | i -> bset t.data i ((bget t.data i) lor (bget t'.data i)); > - unite_loop (i-1) in > - if size < size' then begin > - let b = bcreate size' in > - unite_loop (size'- 1); > - t.len <- size'; > - t.data <- b; > - end else > - unite_loop (size - 1) > - > -let differentiate t t' = > - for i = 0 to t.len - 1 do > - bset t.data i ((bget t.data i) land (lnot (bget t'.data i))) > - done > - > -let differentiate_sym t t' = > - let size = t.len and size' = t'.len in > - let rec diff_sym_loop = function > - | -1 -> () > - | i -> bset t.data i ((bget t.data i) lxor (bget t'.data i)); > - diff_sym_loop (i-1) in > - if size < size' then begin > - let b = bcreate size' in > - diff_sym_loop (size'- 1); > - t.len <- size'; > - t.data <- b; > - end else > - diff_sym_loop (size - 1) > - > let raw_create size = > let b = bcreate size in > bfill b 0 size 0; > @@ -263,9 +243,11 @@ > let max_size = max a.len b.len in > let d = raw_create max_size in > let sl = min a.len b.len in > + let abuf = a.data > + and bbuf = b.data in > (* Note: rest of the array is set to zero automatically *) > for i = 0 to sl-1 do > - bset d.data i ((bget a.data i) land (bget b.data i)) > + bset d.data i ((bget abuf i) land (bget bbuf i)) > done; > d > > @@ -274,8 +256,10 @@ > let union a b = > let d = if a.len > b.len then copy a else copy b in > let sl = min a.len b.len in > + let abuf = a.data > + and bbuf = b.data in > for i = 0 to sl-1 do > - bset d.data i ((bget a.data i) lor (bget b.data i)) > + bset d.data i ((bget abuf i) lor (bget bbuf i)) > done; > d > > @@ -284,8 +268,10 @@ > let buf = bcreate maxlen in > bblit a.data 0 buf 0 a.len; > let sl = min a.len b.len in > + let abuf = a.data > + and bbuf = b.data in > for i = 0 to sl-1 do > - bset buf i ((bget a.data i) land (lnot (bget b.data i))) > + bset buf i ((bget abuf i) land (lnot (bget bbuf i))) > done; > { data = buf; len = maxlen } > > @@ -295,7 +281,32 @@ > (* Copy larger (assumes missing bits are zero) *) > bblit (if a.len > b.len then a.data else b.data) 0 buf 0 maxlen; > let sl = min a.len b.len in > + let abuf = a.data > + and bbuf = b.data in > for i = 0 to sl-1 do > - bset buf i ((bget a.data i) lxor (bget b.data i)) > + bset buf i ((bget abuf i) lxor (bget bbuf i)) > done; > { data = buf; len = maxlen } > + > +(* TODO the following set operations can be made faster if you do the > + set operation in-place instead of taking a copy. But be careful > + when the sizes of the bitvector strings differ. *) > +let intersect t t' = > + let d = inter t t' in > + t.data <- d.data; > + t.len <- d.len > + > +let differentiate t t' = > + let d = diff t t' in > + t.data <- d.data; > + t.len <- d.len > + > +let unite t t' = > + let d = union t t' in > + t.data <- d.data; > + t.len <- d.len > + > +let differentiate_sym t t' = > + let d = sym_diff t t' in > + t.data <- d.data; > + t.len <- d.len |
From: Janne H. <ja...@hy...> - 2004-12-30 00:20:36
|
Umm yes, I forgot to add, that strangely it seems that ocamlopt is using byterun/obj.c even if it looks like it's only meant for bytecode. Already deleted the patched compiler though, so I can't go and re-check. ciao, janne skaller wrote: > On Thu, 2004-12-30 at 08:22, Janne Hellsten wrote: > >>Just for the record: >> >>I patched my O'Caml 3.08.2 byterun/obj.c by commenting out line 107 >> >> "if (sz == 0) return arg;" >> >>O'Caml bootstrap went without problems and the problem with >>non-tempfixed DynArray disappeared. > > > In native code too? :-> > > |
From: skaller <sk...@us...> - 2004-12-30 00:17:26
|
On Thu, 2004-12-30 at 08:22, Janne Hellsten wrote: > Just for the record: > > I patched my O'Caml 3.08.2 byterun/obj.c by commenting out line 107 > > "if (sz == 0) return arg;" > > O'Caml bootstrap went without problems and the problem with > non-tempfixed DynArray disappeared. In native code too? :-> -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Janne H. <ja...@hy...> - 2004-12-29 21:22:41
|
Just for the record: I patched my O'Caml 3.08.2 byterun/obj.c by commenting out line 107 "if (sz == 0) return arg;" O'Caml bootstrap went without problems and the problem with non-tempfixed DynArray disappeared. ciao, janne Nicolas Cannasse wrote: >>The following test case in the testing suite causes a segfault in >>garbage collection: > > > I reported it on Caml list. > Looks like the following is making crash too : > > let test() = > let a = Obj.new_block 0 0 in > let b = Obj.dup a in > Gc.major() > ;; > > I'm not sure I understand why, since "a" is a valid block, it looks like a > caml bug to me. > I'll fix the idup method on CVS until I got more informations on this. |
From: Nicolas C. <war...@fr...> - 2004-12-29 13:40:20
|
> The following test case in the testing suite causes a segfault in > garbage collection: I reported it on Caml list. Looks like the following is making crash too : let test() = let a = Obj.new_block 0 0 in let b = Obj.dup a in Gc.major() ;; I'm not sure I understand why, since "a" is a valid block, it looks like a caml bug to me. I'll fix the idup method on CVS until I got more informations on this. Nicolas |
From: Janne H. <ja...@hy...> - 2004-12-28 10:02:59
|
skaller wrote: > I'm too lazy to search the archives, but segfaults were reported > with Dynarray some time ago. Much more recently someone actually > found a problem with 0 length Dynarrays. Even DynArray.create makes 0 length DynArrays, so it would be quite surprising if this was the actual bug. On the other hand, DynArray.copy uses a magic "idup" which is used only in two places in the module. This might well be the culprit. >> I think this approach is perfectly reasonable >>since GC bugs should not be that frequent. > > > Hmmm. What *other* kinds of bugs do you expect to find?? So far I have found only one bug that manifests itself in the GC cycle. Here are the bugs I've found from BitSet: - BitSet.compare logic mistakes, didn't handle different bit vector lengths - BitSet.differentiate et al. read past the bit vector thus producing a random answer for some test cases. Since these were reads, they didn't cause problems in GC. These kinds of bugs may or may not segfault, usually they will not if they only read a few bytes off. - Inconsistent behaviour: BitSet.is_set should throw an exception for negative indices (since others do as well!) but it doesn't. In the future if a Date/Time modules get added into ExtLib and should there be any bugs, I doubt these will be related to GC. Same goes for path handling. > --test=mmm_aa_ttt > --exe=foobar Yep, been thinking about the same options. I'll probably add these during this week. ciao, janne |
From: skaller <sk...@us...> - 2004-12-28 00:59:28
|
On Tue, 2004-12-28 at 10:32, Janne Hellsten wrote: > skaller wrote: > > Nope -- someone else reported this problem, I just > > coded it into a test. > > Oh OK. Well thanks to the original reporter then :) > > I was not aware of this, so I've been thinking it's a new bug. I'm too lazy to search the archives, but segfaults were reported with Dynarray some time ago. Much more recently someone actually found a problem with 0 length Dynarrays. Dynarray has been around for a while, so it is possible this caused the original problem. This kind of bug is very hard to find, which is a good reason to get rid of unnecessary unsafe accesses and add assertions. IMHO. This should not impact production code where assertions are turned off, and unsafe accesses replace all safe ones. > > Hmm. If we had unit test level control in the test > > harness, we could add/not add Gc.major() after every > > test. > > Yep that's right. We could have it as a flag for Util.run_test. > However, this can also be done in a regression tester fashion by > isolating the test into a new test module (new file) and then adding the > Gc.major () call by hand. That's true. > I think this approach is perfectly reasonable > since GC bugs should not be that frequent. Hmmm. What *other* kinds of bugs do you expect to find?? > There is also a possibility (because we have your mktest) to compile a > fresh new executable out of each test. This way each test would be run > in a separate process and thus they would not interfere with each other. Yes, but the harness generator doesn't handle units (unit tests) only groups of them. > There are also other variations such as randomizing the order of test > execution inside a single process etc etc. Yes, and calling the test multiple times too... > > Ah, you added --module :) > I did not add --test=foo, since there's not much need for it yet. It seems unlikely there will be any 'systematic' naming of tests, so probably the option should be --test=mmm_aa_ttt i.e. specify the whole filename. That test gets added into the test set, which is empty initially if the option is specified. Hmmm.. a more likely option is --exe=foobar which names the output executable, that way you have several testers at once. In particular a standard one, and one you're using to check the tests you're working on. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Janne H. <ja...@hy...> - 2004-12-27 23:32:52
|
skaller wrote: > Nope -- someone else reported this problem, I just > coded it into a test. Oh OK. Well thanks to the original reporter then :) I was not aware of this, so I've been thinking it's a new bug. > Hmm. If we had unit test level control in the test > harness, we could add/not add Gc.major() after every > test. Yep that's right. We could have it as a flag for Util.run_test. However, this can also be done in a regression tester fashion by isolating the test into a new test module (new file) and then adding the Gc.major () call by hand. I think this approach is perfectly reasonable since GC bugs should not be that frequent. There is also a possibility (because we have your mktest) to compile a fresh new executable out of each test. This way each test would be run in a separate process and thus they would not interfere with each other. There are also other variations such as randomizing the order of test execution inside a single process etc etc. >>$ ./mktest --author=jh --module=DynArray >>$ ./extlib_test > > Ah, you added --module :) Yes! $ ./mktest --author=js --author=jh --module=DynArray --module=ExtString calculates an intersection of all modules DynArray and ExtString that have js or jh as the author. I did not add --test=foo, since there's not much need for it yet. ciao, janne |
From: skaller <sk...@us...> - 2004-12-27 23:19:34
|
On Tue, 2004-12-28 at 09:42, Janne Hellsten wrote: > The following test case in the testing suite causes a segfault in > garbage collection: > > --cut-- > (* NOTE this is a copy of Skaller's trivial DynArray test. Apparently > he hit the nail on the head with the first try, Nope -- someone else reported this problem, I just coded it into a test. >open DynArray > > let test_triv () = > let a = make 0 in (* ZERO here causes a segfault later in GC??? *) > let b = copy a in > assert (length a == 0); > assert (length b == 0); > Gc.major () > --cut-- Hmm. If we had unit test level control in the test harness, we could add/not add Gc.major() after every test. > $ ./mktest --author=jh --module=DynArray > $ ./extlib_test Ah, you added --module :) -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Janne H. <ja...@hy...> - 2004-12-27 22:42:21
|
The following test case in the testing suite causes a segfault in garbage collection: --cut-- (* NOTE this is a copy of Skaller's trivial DynArray test. Apparently he hit the nail on the head with the first try, since test_triv causes a segfault if you run Gc.major () after test_triv. If you change the initial size of the DynArray to one or bigger, the crash does not occur. *) open DynArray let test_triv () = let a = make 0 in (* ZERO here causes a segfault later in GC??? *) let b = copy a in assert (length a == 0); assert (length b == 0); Gc.major () --cut-- It is enabled in the testing suite. If you want to run only this case, build mktest and run: $ ./mktest --author=jh --module=DynArray $ ./extlib_test Even though I didn't try, the crash should be reproducible outside of the testing suite. Tested with the latest CVS version of extlib. ciao, janne |
From: Janne H. <ja...@hy...> - 2004-12-27 19:41:41
|
Hopefully this is the last one for now. This is bigger than the previous patch as it addresses more problems. With these changes, BitSet passes all current testing suite tests. If you guys could take a look again..? I will commit after OK'd. is_set change hasn't been discussed here, but I took the liberty to make that change anyway. If you want more speed, Ocamlopt should be passed -noassert in the Makefile to turn off assertion checks. This should be pretty safe, since there doesn't seem to be much code using asserts anyway. Summary of changes: - Assertion guarded bget/bset/bfill/bblit functions - is_set raises Negative_index if passed a negative index (Doesn't make much sense to pass in negative indices, right?) - Removed unite/intersect/differentiate/differentiate_sym implementations and rewrote them using union/inter/diff/sym_diff. Might make just slightly slower, but I guess it's better that they don't overflow anymore. - Small optimizations to union, diff etc functions (innerloops a few instructions shorter) ciao, janne |
From: Bardur A. <oca...@sc...> - 2004-12-27 17:38:52
|
On Mon, Dec 27, 2004 at 06:07:36PM +0200, Janne Hellsten wrote: > I wonder if anyone had a chance to look at this? I'd like to submit > this before I embark on other bugfixes on the same module. I'm a little > hesitant, since I'm new in ExtLib town. > > Summary for the patch: fixes the completely broken BitSet.compare. Has > been tested relatively well with ExtLib testing suite. > I haven't looked at it in detail, but it looks reasonable enough to me... -- Bardur Arantsson <ba...@im...> <ba...@sc...> - We do try to accomodate our customers, but not being a hotel we find it almost impossible. Model Shop Clerk | A Bit of Fry and Laurie |
From: Nicolas C. <war...@fr...> - 2004-12-27 17:25:53
|
> Here are the assertion changes made to BitSet.ml: [...] Looks ok to me. I just wanted to be sure that you checked that ocamlopt would inline it correctly. Nicolas |
From: Janne H. <ja...@hy...> - 2004-12-27 16:07:43
|
I wonder if anyone had a chance to look at this? I'd like to submit this before I embark on other bugfixes on the same module. I'm a little hesitant, since I'm new in ExtLib town. Summary for the patch: fixes the completely broken BitSet.compare. Has been tested relatively well with ExtLib testing suite. ciao, janne Janne Hellsten wrote: > Here is a patch that should fix BitSet.compare bugs. I made a > relatively extensive tester for it and couldn't find any problems. > > The compare now works as if the bitvector was a BigInt. I am guessing > that that has been the original intent as well. > > Perhaps someone can take a look before I commit the changes? > > ciao, > janne > >> Compare seems completely broken to me too, at least if one >> expects it to work like you suggest (which is IMHO the >> correct way). >> It doesn't seem to properly account for the two bitsets >> possibly being of different lengths, it just starts >> comparing at the MSB even if the difference between the >> lengths of the sets is more than a byte. It gets more >> complicated because you can't even use |t1|>|t2| as an >> indication that t1>t2, because there is AFAICT no >> guarantee that the leading bit (or even just one bit in >> the first byte of the representation!) is set. > > > ------------------------------------------------------------------------ > > Index: bitSet.ml > =================================================================== > RCS file: /cvsroot/ocaml-lib/extlib-dev/bitSet.ml,v > retrieving revision 1.12 > diff -u -r1.12 bitSet.ml > --- bitSet.ml 22 Dec 2004 07:59:46 -0000 1.12 > +++ bitSet.ml 22 Dec 2004 21:52:46 -0000 > @@ -109,40 +109,56 @@ > else > false > > -(* we can't use Pervasives.compare because bitsets might be of different > - sizes but are actually the same integer *) > + > +exception Break_int of int > + > +(* Find highest set element or raise Not_found *) > +let find_msb t = > + (* Find highest set bit in a byte. Does not work with zero. *) > + let byte_msb b = > + assert (b <> 0); > + let rec loop n = > + if b land (1 lsl n) = 0 then > + loop (n-1) > + else n in > + loop 7 in > + let n = t.len - 1 > + and buf = t.data in > + try > + for i = n downto 0 do > + let byte = bget buf i in > + if byte <> 0 then raise (Break_int ((i lsl log_int_size)+(byte_msb byte))) > + done; > + raise Not_found > + with > + Break_int n -> n > + | _ -> raise Not_found > + > let compare t1 t2 = > - let size1 = t1.len and size2 = t2.len in > - let size = (if size1 < size2 then size1 else size2) in > - let rec loop2 n = > - if n >= size2 then > - 0 > - else if bget t2.data n <> 0 then > - 1 > - else > - loop2 (n+1) > - in > - let rec loop1 n = > - if n >= size1 then > - 0 > - else if bget t1.data n <> 0 then > - -1 > - else > - loop1 (n+1) > - in > - let rec loop n = > - if n = size then > - (if size1 > size2 then loop1 n else loop2 n) > - else > - let d = bget t2.data n - bget t1.data n in > - if d = 0 then > - loop (n+1) > - else if d < 0 then > - -1 > - else > - 1 > - in > - loop 0 > + let some_msb b = try Some (find_msb b) with Not_found -> None in > + match (some_msb t1, some_msb t2) with > + (None, Some _) -> -1 (* 0-y -> -1 *) > + | (Some _, None) -> 1 (* x-0 -> 1 *) > + | (None, None) -> 0 (* 0-0 -> 0 *) > + | (Some a, Some b) -> (* x-y *) > + if a < b then -1 > + else if a > b then 1 > + else > + begin > + (* MSBs differ, we need to scan arrays until we find a > + difference *) > + let ndx = a lsr log_int_size in > + assert (ndx < t1.len && ndx < t2.len); > + try > + for i = ndx downto 0 do > + let b1 = bget t1.data i > + and b2 = bget t2.data i in > + if b1 <> b2 then raise (Break_int (compare b1 b2)) > + done; > + 0 > + with > + Break_int res -> res > + end > > let equals t1 t2 = > compare t1 t2 = 0 |
From: Janne H. <ja...@hy...> - 2004-12-27 15:57:23
|
Sorry about that previous broken mail.. Had a little accident with my mouse. Nicolas Cannasse wrote: > It's not so easy. > Let's take a blit for example, you might not want to put assert before every > blit call so you will write a blit stub that does asserts before directly > calling blit, and I'm not sure ocamlopt can inline such method easily. > We should not modify the source code of ExtLib for testing, since testing > purpose is actually to detect wrong behavior. Having separate tests is ok > for me, but if we need to add also tests inside extLib sources (assert) that > make twice the same somehow. If there is some overflow problems we can > detect them with appropriate testing (for example run a GC major after every > call ). I have to disagree here. Tests are written to find bugs. In order to make the tests more useful, they should also be written such that they make it more easy to actually fix the bugs. Assertions have proven themselves in practice to do just that: to make it more easy locate and fix bugs. Debugging with asserts is easy, since you can isolate your bug into a small region of code. The same cannot be said about getting a random segfault in a random function when executing some totally unrelated code. If properly used, assertions do not incur a performance cost. As you suggested, small wrapper functions go extremely well with assertions. A compiler shouldn't have much of a problem inlining such wrapper functions, since with -noassert (assertions turned off) these wrappers will only contain a jump to the "wrappee" function. (See more details below) I am not proposing big changes to the library. What I'm suggesting is that if we find a problematic piece of code (such as the current BitSet in CVS), we could minefield a couple strategic points with assertions and leave them there. This would allow us to check that there are no buffer overruns in the code. These checks would only get compiled in "debug build". In all other occasions they would be turned off and thus shouldn't cause any overhead. I also think that any new code should be allowed to have assertions in it. This suggests that -noassert should be in the Makefile, at least if you want these checks to be turned off. *** Since earlier in my post I am claiming that assertions do not incur a run-time cost, I made a test to provide some evidence that this is indeed the case. I have an example from BitSet.ml. In BitSet, there exists the following loop (which reads out-of-bounds): --cut-- let intersect t t' = for i = 0 to t.len - 1 do bset t.data i ((bget t.data i) land (bget t'.data i)) done --cut-- The innerloop of the above when compiled with ocamlopt for x86: --cut-- .L548: movl %ecx, %edx sarl $1, %edx movl %edx, 0(%esp) movl (%eax), %edi movl %ecx, %esi sarl $1, %esi movl (%ebx), %edx movzbl (%edx, %esi), %edx lea 1(%edx, %edx), %esi movl %ecx, %ebp sarl $1, %ebp movl (%eax), %edx movzbl (%edx, %ebp), %edx lea 1(%edx, %edx), %edx andl %esi, %edx sarl $1, %edx movl 0(%esp), %esi movb %dl, (%edi, %esi) movl %ecx, %esi addl $2, %ecx movl 4(%esp), %edx cmpl %edx, %esi jne .L548 --cut-- As expected, there are no bound checks. The next step I did was to change bget and bget functions so that they are actually wrappers for functions fast_get and fast_set. I added bounds check assertions to bget and bset. The changes are detailed in the patch below. Naturally, if we compile with assertions on, the bget/bsets are not inlined. Instead, they get turned into function calls: --cut-- .L588: movl 0(%esp), %eax movl (%eax), %eax call camlBitSet__bget_65 .L590: movl %eax, 4(%esp) movl 8(%esp), %eax movl (%eax), %eax movl 12(%esp), %ebx call camlBitSet__bget_65 .L591: movl %eax, %ecx movl 4(%esp), %ebx andl %ebx, %ecx movl 8(%esp), %ebx movl (%ebx), %eax movl 12(%esp), %ebx call camlBitSet__bset_68 .L592: movl 12(%esp), %ebx movl %ebx, %ecx addl $2, %ebx movl %ebx, 12(%esp) movl 16(%esp), %eax cmpl %eax, %ecx jne .L588 --cut-- Again, this was expected. Now, we turn off assertions by adding -noassert to ocamlopt command line and compile the code with wrapped bget/bset. We get the optimized loop back: --cut-- .L553: movl (%eax), %edi movl (%ebx), %edx movl %esi, %ecx sarl $1, %ecx movzbl (%edx, %ecx), %ecx lea 1(%ecx, %ecx), %ecx movl (%eax), %ebp movl %esi, %edx sarl $1, %edx movzbl (%ebp, %edx), %edx lea 1(%edx, %edx), %edx andl %ecx, %edx movl %esi, %ecx sarl $1, %ecx sarl $1, %edx movb %dl, (%edi, %ecx) movl %esi, %edx addl $2, %esi movl 0(%esp), %ecx cmpl %ecx, %edx jne .L553 --cut-- You can do a little better than that by changing "intersect" function by reading t/t'.data pointers into local variables outside of the loop: --cut-- let intersect t t' = let buf = t.data and buf' = t'.data in for i = 0 to t.len - 1 do bset buf i ((bget buf i) land (bget buf' i)) done --cut-- Which yields: --cut-- .L553: movl %ebx, %eax sarl $1, %eax movzbl (%edi, %eax), %eax lea 1(%eax, %eax), %eax movl %ebx, %edx sarl $1, %edx movzbl (%esi, %edx), %edx lea 1(%edx, %edx), %edx andl %eax, %edx movl %ebx, %eax sarl $1, %eax sarl $1, %edx movb %dl, (%esi, %eax) movl %ebx, %eax addl $2, %ebx cmpl %ecx, %eax jne .L553 --cut-- Well it still isn't all that hot, but I guess it's been already discussed and agreed that if you want more performance, you should use a library written more specifically for your purpose. Here are the assertion changes made to BitSet.ml: <patch> diff -u -r1.12 bitSet.ml --- bitSet.ml 22 Dec 2004 07:59:46 -0000 1.12 +++ bitSet.ml 27 Dec 2004 14:42:26 -0000 @@ -21,11 +21,27 @@ type intern let bcreate : int -> intern = Obj.magic String.create -external bget : intern -> int -> int = "%string_unsafe_get" -external bset : intern -> int -> int -> unit = "%string_unsafe_set" +external fast_get : intern -> int -> int = "%string_unsafe_get" +external fast_set : intern -> int -> int -> unit = "%string_unsafe_set" external fast_bool : int -> bool = "%identity" -let bblit : intern -> int -> intern -> int -> int -> unit = Obj.magic String.blit -let bfill : intern -> int -> int -> int -> unit = Obj.magic String.fill +let fast_blit : intern -> int -> intern -> int -> int -> unit = Obj.magic String.blit +let fast_fill : intern -> int -> int -> int -> unit = Obj.magic String.fill +let fast_length : intern -> int= Obj.magic String.length + +let bget s ndx = + assert (ndx >= 0 && ndx < fast_length s); + fast_get s ndx + +let bset s ndx v = + assert (ndx >= 0 && ndx < fast_length s); + fast_set s ndx v + +let bblit src srcoff dst dstoff len = + fast_blit src srcoff dst dstoff len + +let bfill dst start len c = + fast_fill dst start len c + </patch> Best regards, Janne |
From: Janne H. <ja...@hy...> - 2004-12-27 15:53:56
|
make sure |
From: skaller <sk...@us...> - 2004-12-24 14:13:22
|
On Fri, 2004-12-24 at 22:32, Bardur Arantsson wrote: > On Fri, Dec 24, 2004 at 10:05:10PM +1100, skaller wrote: > > > On Fri, 2004-12-24 at 11:31, Janne Hellsten wrote: > > > > > Since I guess people are not willing to stop using unsafe_get/put (at > > > least not completely), we should think of some other way to achieve > > > bounds checking so that it doesn't hurt performance. > > > > Why doesn't just the usual get/put and compilation with -unsafe > > for a production build work? I mean that's the idea of -unsafe > > isn't it? > > Well, it probably works, but it requires even more > confidence that the code is correct, because suddenly > *every* array/string access becomes unsafe. > I don't think I have that much confidence in the extlib > code right now... The point is that the change needs to be made to test it .. and gain that confidence. I think you should be able to run a version with the checks still in place if you want. > I think this could be implemented by using pa_macro Yes, but I think we should avoid camlp4. This is supposed to be a core library. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Nicolas C. <war...@fr...> - 2004-12-24 12:01:17
|
> How would people feel about having release and debug builds of ExtLib? > The release would have assertions turned off. This is basically the > same as the current release version. The debug build would always have > assertions on and thus be slower. However, it would only be used for > testing purposes. It's not so easy. Let's take a blit for example, you might not want to put assert before every blit call so you will write a blit stub that does asserts before directly calling blit, and I'm not sure ocamlopt can inline such method easily. We should not modify the source code of ExtLib for testing, since testing purpose is actually to detect wrong behavior. Having separate tests is ok for me, but if we need to add also tests inside extLib sources (assert) that make twice the same somehow. If there is some overflow problems we can detect them with appropriate testing (for example run a GC major after every call ). Nicolas Cannasse |
From: Bardur A. <oca...@sc...> - 2004-12-24 11:32:16
|
On Fri, Dec 24, 2004 at 10:05:10PM +1100, skaller wrote: > On Fri, 2004-12-24 at 11:31, Janne Hellsten wrote: > > > Since I guess people are not willing to stop using unsafe_get/put (at > > least not completely), we should think of some other way to achieve > > bounds checking so that it doesn't hurt performance. > > Why doesn't just the usual get/put and compilation with -unsafe > for a production build work? I mean that's the idea of -unsafe > isn't it? Well, it probably works, but it requires even more confidence that the code is correct, because suddenly *every* array/string access becomes unsafe. I don't think I have that much confidence in the extlib code right now... [--snip--] > >The debug build would always have > >assertions on and thus be slower. However, it would only be used for > >testing purposes. I think this could be implemented by using pa_macro and adding something like (syntax might be slightly wrong, I usually use revised syntax) IFDEF DEBUG module String = struct include String val unsafe_get = get val unsafe_set = set end module Array = struct include Array val unsafe_get = ... ... end ENDIF at the beginning of modules using unsafe_get/set... (You can also use the pa_macro INCLUDE directive, add the above to a file and include that in the appropriate places). This should allow you to enable/disable bounds checking at compilation time. -- Bardur Arantsson <ba...@im...> <ba...@sc...> The best part? I became an ordained minister while not wearing pants. |
From: skaller <sk...@us...> - 2004-12-24 11:05:18
|
On Fri, 2004-12-24 at 11:31, Janne Hellsten wrote: > Since I guess people are not willing to stop using unsafe_get/put (at > least not completely), we should think of some other way to achieve > bounds checking so that it doesn't hurt performance. Why doesn't just the usual get/put and compilation with -unsafe for a production build work? I mean that's the idea of -unsafe isn't it? My guess is that bytecode should always do the check, and native code never. Bytecode is slower, however the bounds check for bytecode is probably *relatively* faster (i.e. a smaller fraction of the access time than for native code). >The debug build would always have >assertions on and thus be slower. However, it would only be used for >testing purposes. Hmm. Sometimes, functions have preconditions which assertions can check. If the assertion is in the function, and debugging is off in a production build, a *client error* can't be detected. This kind of check isn't an internal consistency check.. in theory, the precondition should be checked in the client code -- I'm planning this for Felix, but most programming systems don't support precondition checking in the calling code, only in the called code. So I would question whether the debugging version shouldn't also be available to end users... Why not change the Extlib build to make both? Then the client can switch between debug and production by changing libraries on the command line? -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |
From: Janne H. <ja...@hy...> - 2004-12-24 00:31:51
|
Bardur Arantsson wrote: > Generally I would say one should always compile without > -unsafe and use unsafe_get and companions in those > instances where it can be *guaranteed* to stay within the > bounds *AND* where it matters for performance. > > Even in cases where it is easy to prove that the index is > always within bounds, I wouldn't use unsafe_get/set unless > it actually matters a great deal for performance. It > reduces readability and makes code harder to modify later. unsafe_get/set have one problem w.r.t. testing: when they're used, it is impossible to turn *on* bounds checking. When using .()/.[] and compiling with -unsafe, it is at least possible to turn on bounds checking when testing. Having integrated more tests into the extlib testing suite, I have come to the conclusion that bounds checking would be really nice during testing. The current situation is that ExtLib seems to be buffer overflowing in a few places, causing a segfault somewhere when testing ExtString module. This happens with just a handful of test cases, so I'm guessing that there are more problems like this waiting to be uncovered. As most parts of ExtLib are using unsafe_get/set, it is impossible to catch buffer under/overflows in places where they actually happen. As an example, a buffer overflow somewhere can cause an infinite loop inside O'Caml's garbage collector (happens currently in native code). It would be really nice to be able to catch cases like these *before* they cause any damage. Bounds checking does exactly this. Since I guess people are not willing to stop using unsafe_get/put (at least not completely), we should think of some other way to achieve bounds checking so that it doesn't hurt performance. I'm proposing the use of asserts to check bounds when accessing arrays and other data with unsafe methods. We could turn asserts off when building the release library, but we could always have asserts turned on when running the testing suite. This way we would catch possible buffer overruns immediately and not when their damage shows up elsewhere. How would people feel about having release and debug builds of ExtLib? The release would have assertions turned off. This is basically the same as the current release version. The debug build would always have assertions on and thus be slower. However, it would only be used for testing purposes. Merry Christmas, Janne |
From: skaller <sk...@us...> - 2004-12-23 04:42:46
|
On Thu, 2004-12-23 at 04:35, Janne Hellsten wrote: I have committed 'mktest.ml' to the extlib-test repository. This program is similar to the one initially posted to this list, except it is a bit cleaned up and: > 1. Building in tmp directory (extlib-test/build-tmp). Done. > 2. Remove the .tst extension and just use .ml. Done. Also, mktest program accepts options: --author=xxxx If at least one such option is found, the set of tests is filtered on the set of authors, so you can selectively build tests from all, one, or some set of test authors. The program prints the list of modules and tests. To try out this program do this: ocamlc -o mktest unix.cma str.cma mktest.ml && \ ./mktest --author=js --author=jh && \ ./extlib_test This should create: (a) a directory 'build-tmp' -- with permissions 0o777 (no idea what to use but it worked on my box, please fix it if you know better -- this is the only use of module Unix, I switched to Sys everywhere else for portability (b) stuff in build-tmp (c) A top level program 'extlib_test' It shouldn't touch anything else in the extlib-test directory. I have NOT changed any test names to conform to the required format: test_<author>_<module>_<test>.ml noting again that the <test> suffix is mandatory. I suggest authors either number their tests with three digits, or use a name suggestive of the kind of test. The need for the suffix could be relaxed but I have not done it. I have NOT changed anything else: for example the Makefile and other files are left untouched. If the harness is acceptable some files will need to be removed from the repository and others renamed. I was thinking that if the 'author' option is specified, the executable should perhaps have a different name .. but at present it doesn't. Note that even with the author specified you get a thunk for every module, even if there are no tests for that module. Also note there isn't any proper checking -- tests with module names that don't match an actual ExtLib module may cause problems -- I think there should be a warning, what actually happens is that they're compiled and linked but never called. It probably makes sense to be able to filter on the module too, I think that can be done in a couple of minutes, but I thought I would commit it now so people can have a look. Also, if we agree on the test naming protocol and also the requirement for a test to be properly called by the harness, it should be documented in a README.html file or something, which goes in the repository so people know how to construct tests. Finally -- I haven't bothered to add licence to the mktest.ml file, that probably needs to be done. BTW: on licencing, it may be worth everyone *assiging copyright* to Nicolas so any decisions about licencing policy can be executed by one person. For example I could ask for permission to include ExtLib in Felix which uses a BSD like licence, and if most people agreed with that Nicolas could execute the group decision legally if he was the sole copyright holder. Otherwise everyone who made a contribution would have to give permission.. (and one person who has gone away would thwart the group's choice). Note INRIA may change the Ocaml licence, which is a scenario in which the Extlib licence should probably be changed to match. -- John Skaller, mailto:sk...@us... voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net |