You can subscribe to this list here.
2003 |
Jan
|
Feb
(81) |
Mar
(97) |
Apr
(88) |
May
(80) |
Jun
(170) |
Jul
(9) |
Aug
|
Sep
(18) |
Oct
(58) |
Nov
(19) |
Dec
(7) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2004 |
Jan
(22) |
Feb
(9) |
Mar
(28) |
Apr
(164) |
May
(186) |
Jun
(101) |
Jul
(143) |
Aug
(387) |
Sep
(69) |
Oct
(14) |
Nov
(8) |
Dec
(99) |
2005 |
Jan
(10) |
Feb
(34) |
Mar
(24) |
Apr
(7) |
May
(41) |
Jun
(20) |
Jul
(3) |
Aug
(23) |
Sep
(2) |
Oct
(26) |
Nov
(41) |
Dec
(7) |
2006 |
Jan
(6) |
Feb
(3) |
Mar
(11) |
Apr
|
May
|
Jun
(5) |
Jul
(8) |
Aug
(20) |
Sep
|
Oct
(6) |
Nov
(5) |
Dec
|
2007 |
Jan
|
Feb
(1) |
Mar
|
Apr
(3) |
May
(2) |
Jun
|
Jul
|
Aug
(1) |
Sep
(7) |
Oct
(6) |
Nov
(19) |
Dec
(11) |
2008 |
Jan
|
Feb
(7) |
Mar
(9) |
Apr
(21) |
May
(42) |
Jun
(27) |
Jul
(28) |
Aug
(26) |
Sep
(16) |
Oct
(32) |
Nov
(49) |
Dec
(65) |
2009 |
Jan
(35) |
Feb
(20) |
Mar
(36) |
Apr
(42) |
May
(111) |
Jun
(99) |
Jul
(70) |
Aug
(25) |
Sep
(15) |
Oct
(29) |
Nov
(3) |
Dec
(18) |
2010 |
Jan
(10) |
Feb
(4) |
Mar
(57) |
Apr
(63) |
May
(71) |
Jun
(64) |
Jul
(30) |
Aug
(49) |
Sep
(11) |
Oct
(4) |
Nov
(2) |
Dec
(3) |
2011 |
Jan
(1) |
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
(1) |
Jul
(1) |
Aug
(2) |
Sep
|
Oct
|
Nov
|
Dec
(1) |
2012 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2013 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2014 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
(2) |
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
(1) |
Dec
|
2015 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
(1) |
Sep
|
Oct
|
Nov
|
Dec
|
2021 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
|
2022 |
Jan
|
Feb
(1) |
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
2023 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
(1) |
Oct
|
Nov
|
Dec
(1) |
2024 |
Jan
(1) |
Feb
(3) |
Mar
(6) |
Apr
(2) |
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
(1) |
2025 |
Jan
(2) |
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
(1) |
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
From: Yamagata Y. <yor...@mb...> - 2003-06-22 21:42:09
|
I find some problems in uChar.ml. (chr do not check the argument properly, and some functions raise the wrong exception.) The correct version is attached below. Also, I tried to optimize uTF.ml further, using unsafe operations. I actually do not like this kind of the trick, but it seems to squeeze about 30-40% speed up. Is there other things to do for the inclusion? -- Yamagata Yoriyuki |
From: John S. <sk...@oz...> - 2003-06-22 18:35:48
|
Some exception handling stuff that is related to the option module. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: Yamagata Y. <yor...@mb...> - 2003-06-22 09:57:46
|
From: John Max Skaller <sk...@oz...> Subject: Re: [Ocaml-lib-devel] second proposal of UChar, UTF8 modules Date: Sun, 22 Jun 2003 09:39:52 +1000 > Still has the redundant tests here. Fixed. Thanks. The file is attached below. -- Yamagata Yoriyuki |
From: Michal M. <mal...@pl...> - 2003-06-22 07:52:09
|
On Sun, Jun 22, 2003 at 09:51:40AM +1000, John Max Skaller wrote: > Hmmm.. it was said a C implementation should use VALUE > macro not integer 0. A small C interface will make > that available as a universal NULL value? It doesn't have to NULL pointer, any other pointer should do. Using: let null = ref 0.0 seems to work (with Obj.magic). Using Obj.magic 0 to assign float array causes SEGV. But this null needs to be global, so it won't get GCed itself. -- : Michal Moskal :: http://www.kernel.pl/~malekith : GCS {C,UL}++++$ a? !tv : When in doubt, use brute force. -- Ken Thompson : {E-,w}-- {b++,e}>+++ h |
From: John M. S. <sk...@oz...> - 2003-06-22 00:58:25
|
Brian Hurt wrote: > On Sat, 21 Jun 2003, John Skaller wrote: >>The fast_count function must be elided: >>it is a symptom of the designer not being >>able to make up his mind exactly what >>the model is. >> >>This is non-trivial. The only way to count >>a filtered enumeration is eagerly, >>on the other hand next can be applied lazily. >> > > I think part of the problem is that enumeration is becoming a victim of > it's own success. We're starting to want to apply it in ways not > originally conceived of. It's a great way to "glue together" parts of a > program. But this is putting stress on the original model, as we keep > going "well, this situation is almost exactly like an enumeration, > except we need to be able to do this...". Yes, but it isn't really the model that is being stressed but its specification. It is OK to fix a specification that eliminates some uses, in fact it is necessary. The same problem had to be dealt with for C++ STL iterators. There, they have conceptual distinctions between: input iterator forward iterator bidirectional iterator random iterator Whilst input iterators can be copied, nothing is said about the currency or validity of any iterator than the one last incremented: the others are all invalidated by this operation. On the other hand forward iterator are required to hold their position. So: a file handle is an input iterator, since when you seek on it, the seek point of all copies changes. A file handle together with a file position is a forward iterator, since it always seeks from the place it last finished reading. In STL, algorithms specify requirements on iterators. Here, you're trying to package up the algorithms *with* the iterators [the beauty of STL is that containers and algorithms interact VIA iterators, and so iterators are used to factor container design and algorithm design] In doing that, you are more or less forced to decide which kind of iterator the algorithms will work on: if you choose input iterators, a count consumes the enumeration: it has to, and it also must be REQUIRED to: cheating with a fast transparent count can't be allowed: someone might actually rely on the enumeration being consumed, for example in an iter2 which ought to terminate immediately if you count one of the arguments. Input iterators are good for coinductive data structures like streams, where the next operation is destructive. Forward iterators are more useful in general, and almost always work fine with inductive data structures: although, for example, counting might consume an enumeration, you do the count on a clone of the iterator, so the copied iterator is not affected. > There are time we *need* a count, almost no matter how expensive it is. Yes. The question is: can we allow it to consume the enumeration? In the spec I committed, the answer is NO: the user supplied count is not permitted to consume the enumeration. Effectively, I'm specifying forward iterators. This rules out enumerations of coinductive data structures like streams. > All the places where I can see from being usefull, init as defined above > is just as usefull. I don't have a problem with this. But I'm not sure > it solves the basic problems. Sure it does. The problem is simply that you have to specify that count is non-consuming or consuming. If you cannot implement such a count, you cannot use the Enum module. No problem. You may need TWO enum modules: there are TWO kinds of iterator. For 'from', you can even leave it provided you make it eager: it must expand the function immediately to obtain the count. The point is that then the client *knows* by specification when the function is called: immediately. Otherwise, it is useless because you *cannot* count it without destroying the enumeration permanently, defeating the purpose. Clearly, delaying the expansion doesn't work. If you append two copies of the same function, what happens? When the expansion is forced, the function is called in arbitrary order. let seq = ref 0 in let g () = if !seq > 2000 raise No_value; incr seq; !seq in Enum.append (Enum.from g, Enum.from g) The constructed enum (at the moment) is an input iterator. Clearly any counting will skip some values of seq .. indeed, it will change the behaviour of subsequent calls. On the other hand, init whilst not necessarily producing a forward iterator, WILL produce one if the argument is a function (mathematically). So the interface is good, the onus is on the client to provide a purely functional argument. The reason I think counting should be required NOT to consume the enumeration is that it means counting can be made lazy. BUT you have to eliminate filters :-) count (add(f,g)) = add (count f, count g) is the requirement. Here, 'add' means 'add integers' in the count value category, and 'add enumerations' in the enumeration category (of type T): for map: count (map f) = map (count f) Here the RHS map is 'identity function'. count (filter f) = filter (count f) Woops! There is no possible RHS filter, because count has lost the values of f, and filter needs them. There is some category theory here about functors and natural transformations .. but the idea should be clear that count must commute with any constructor you use. Fold preserves additive structure :-) count (fold f) = fold (count f) The RHS fold is the function f x = 1 :-) Note this is true EVEN IF THE RESULT OF THE FOLD IS AN ENUMERATION. In particular, to make say an array from a lazy enumeration, you can count the enumeration, allocate the array, and provide an extraction function to initialise it, all without needing to expand the lazy data structure. It doesn't matter if this is slow or internally clumbsy: the client can improve performance by calling 'force' first. Otherwise they know the order next is called on their data structures (the order of elements in the array) So the rule would be, for forward iterators: force is only called internally when all the forced enumerations would be conumed anyhow. In that case, it is OK to force: note that forward iterators BY SPECIFICATION act independently. So the order of forcing subcomponents is irrelevant, what matters is that next is called until end on each subcomponent enumeration. Err... does that make sense? iter, iteri et all: may call force fold: must not count: must not ... I think the overall design is good. You only need to commit to a specification and the follow through the implications. The only real difficulty is that there is more than one possible specification. If that means two enum modules, so be it: don't try to make a single module do two distinct jobs. A module with 'count' intrinsic should be forward iterators. A module without count should be input iterators. You do not need count to make an array. You make a list, count that, and then make an array .. of course Dynarray might be better than a list or whatever. If there is duplicated code in these two modules, you can factor it out into a third algorithms module, and then include it. By the way, I think being able to lazily append input iterators and then force them, for example to concatenate two files, is very useful. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: John M. S. <sk...@oz...> - 2003-06-21 23:52:08
|
Yamagata Yoriyuki wrote: > From: Alan Post <ap...@re...> > Subject: [Ocaml-lib-devel] Re: Array interface > Date: Fri, 20 Jun 2003 23:56:25 +0000 (UTC) > > >>What is under discussion is what to put in the "empty" slots of the >>newly doubled array, ocaml having no "null" or "undef" value to stuff >>in them. >> > > I do not understand the motive of Dynarray, Variable length array with automatic efficient resizing. It needs to allocate more store than the number of slots actually needed to minimise reallocation on extension of the array, and allow contraction without necessarily reallocating. So there are some slots of the underlying array that are unused. The problem is how to fool the garbage collector without a dummy object to put in the unused slots. The client may not be able to construct an object of the right type easily, and copying a real element may create references that prevent the collector removing an object -- unless the design carefully refills unused slots with the value of a used slot -- and destroys the array completely when no slots are used. Therefore, there is no really good design 'in ocaml': Dynarray is really a system primitive. Hmmm.. it was said a C implementation should use VALUE macro not integer 0. A small C interface will make that available as a universal NULL value? I guess this is a C null pointer: ie, an address the collector will leave alone, rather than an integer the collector will leave alone? -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: John M. S. <sk...@oz...> - 2003-06-21 23:40:11
|
Yamagata Yoriyuki wrote: Still has the redundant tests here. > let add_uchar buf u = > let masq = 0b111111 in > let k = int_of_uchar u in > if k >= 0 && k <= 0x7f then > Buffer.add_char buf (Char.chr k) > else if k >= 0x80 && k <= 0x7ff then begin > Buffer.add_char buf (Char.chr (0xc0 lor (k lsr 6))); > Buffer.add_char buf (Char.chr (0x80 lor (k land masq))) > end else if k >= 0x800 && k <= 0xffff then begin > Buffer.add_char buf (Char.chr (0xe0 lor (k lsr 12))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 6) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor (k land masq))); > end else if k >= 0x10000 && k <= 0x1fffff then begin > Buffer.add_char buf (Char.chr (0xf0 + (k lsr 18))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 12) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 6) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor (k land masq))); > end else if k >= 0x200000 && k <= 0x3ffffff then begin > Buffer.add_char buf (Char.chr (0xf8 + (k lsr 24))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 18) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 12) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 6) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor (k land masq))); > end else begin > Buffer.add_char buf (Char.chr (0xfc + (k lsr 30))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 24) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 18) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 12) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor ((k lsr 6) land masq))); > Buffer.add_char buf (Char.chr (0x80 lor (k land masq))); > end -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: Brian H. <bri...@ql...> - 2003-06-21 17:41:07
|
On Sat, 21 Jun 2003, John Skaller wrote: > > That isn't really acceptable, since the enumeration > is conceptually a model of a mutable object: > even though the enum module itself is functional, > it is being used to contruct mutator functions > which operate on a stateful object. I toyed with the idea of a applicative enumeration, and then realized something: while you can always wrap an applicative behavior with an imperitive behavior (using a reference if nothing else), you can not wrap an imperitive behavior with an applicative behavior. The idea I had was basically to make Enum.next look like: val next: 'a Enum.t -> 'a option * 'a Enum.t I.e. return the next object and the enumeration of the rest. Then I considered how would I write an enumeration of the lines of a file? > > The fast_count function must be elided: > it is a symptom of the designer not being > able to make up his mind exactly what > the model is. > > This is non-trivial. The only way to count > a filtered enumeration is eagerly, > on the other hand next can be applied lazily. I think part of the problem is that enumeration is becoming a victim of it's own success. We're starting to want to apply it in ways not originally conceived of. It's a great way to "glue together" parts of a program. But this is putting stress on the original model, as we keep going "well, this situation is almost exactly like an enumeration, except we need to be able to do this...". There are time we *need* a count, almost no matter how expensive it is. An example of this is ExtArray.of_enum. We need to count to know how large to make the array. In other situations, a count is usefull but not necessary. For example, if you're inserting an enumeration into the middle of a Dynarray, having a count is usefull- this way I only have to resize and move elements once. However, I can fake it without too much heartache if I don't have a count. I'll have to move things approximately log N times, but that's not too horrible. Maybe a way to help clear the air a little bit would be to list those places we might like to use enumerations, and the problems/demands those uses place on the enumeration interface. To get us started, here's my list: - enumerating a list. Counts are O(N) - reading file as an enumeration of strings (one per line). Don't know in advance how many lines are going to be read, imperitive semantics (next changes the state of the io channel). Clone is interesting. - creating an array from the enumeration. Count is necessary, alternative is creating a linked list. - insert an enumeration into the middle of a dynarray. Count helpful, alternatives not too bad. - creating tree data structures. Need an exported next, with a count I can build a balanced tree in O(N), without it's O(N log N). - filtering an enum- makes even easy O(1) counts (like arrays) into unknown counts. More? > The [from] function is the biggest hassle, > I'd consider removing it. The client can > always use > > init f limit > > which also might turn accidental infinite loops > into merely very long ones :-) All the places where I can see from being usefull, init as defined above is just as usefull. I don't have a problem with this. But I'm not sure it solves the basic problems. Brian |
From: Yamagata Y. <yor...@mb...> - 2003-06-21 08:48:37
|
From: Alan Post <ap...@re...> Subject: [Ocaml-lib-devel] Re: Array interface Date: Fri, 20 Jun 2003 23:56:25 +0000 (UTC) > What is under discussion is what to put in the "empty" slots of the > newly doubled array, ocaml having no "null" or "undef" value to stuff > in them. I do not understand the motive of Dynarray, so perhaps it is dumb to ask, but what is wrong with Markus's RES library? -- Yamagata Yoriyuki |
From: Yamagata Y. <yor...@mb...> - 2003-06-21 08:38:31
|
The second proposal of UChar and UTF8 modules are attached. The improvements are * Better documentation * Error reporting * Performance improvement * Code clean up. As before, I did some random tests. -- Yamagata Yoriyuki |
From: John M. S. <sk...@oz...> - 2003-06-21 04:30:43
|
Brian Hurt wrote: > On Fri, 20 Jun 2003, Nicolas Cannasse wrote: > This is one place where has_next would be usefull. But it is a significant change in semantics because it requires one element lookahead. Some data structures cannot support that without side effects, for example reading the next line of a file. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: John M. S. <sk...@oz...> - 2003-06-21 04:27:31
|
Eric C. Cooper wrote: > On Fri, Jun 20, 2003 at 03:58:38PM -0500, Brian Hurt wrote: > I'm probably missing some essential context, but what is wrong with > the simple approach of using a mutable 'a array that gets replaced > with a bigger one when you try to set an element beyond its current > length? Nothing other than performance. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: John M. S. <sk...@oz...> - 2003-06-21 04:25:18
|
Brian Hurt wrote: > First off, an apology. I was grumpy this morning about things completely > irrelevent to the list- and thus responded a lot more hotly than was > called for. Sorry. And thank you for being rational enough not to flame > in return. I wouldn't dare .. its usually ME that is the hot head: get in trouble for it all the time on the C++ committee reflector. > I'm not opposed to option 3. Heck, I'd even support an implemention > along those lines. I'm just unlikely to write it anytime soon. It should be done by the OCaml team. They'll get it right and ensure it stays synched with implementation changes. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: John S. <sk...@oz...> - 2003-06-21 04:19:03
|
sh: /cvsroot/ocaml-lib/CVSROOT/commit.log: Permission denied Committed some changes to doco in enum.mli. Looks like a bug in sourceforge CVS setup, don't see how this is possible though... its heavily used by multiple developers. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: John S. <sk...@oz...> - 2003-06-21 03:52:44
|
The current Enum module seems inconsistent. The interplay between force, count, etc make the point in time at which user supplied next function is called indeterminate. That isn't really acceptable, since the enumeration is conceptually a model of a mutable object: even though the enum module itself is functional, it is being used to contruct mutator functions which operate on a stateful object. The fast_count function must be elided: it is a symptom of the designer not being able to make up his mind exactly what the model is. This is non-trivial. The only way to count a filtered enumeration is eagerly, on the other hand next can be applied lazily. In particular: it must be specified not only if count consumes an emumeration, but also if it forces lazy operations. Both questions must have a definite answer. The same is probably true for all other operations, however in most cases the answer is obvious. The point in time a next function supplied to make is called is of vital interest to the client and must be predictable due to side effects on the object being enumerated. There are several possible ways to resolve this issue, it is a nasty problem, but expected since the entity being modelled is stateful, and the client is providing mutators to the Enum module. The [from] function is the biggest hassle, I'd consider removing it. The client can always use init f limit which also might turn accidental infinite loops into merely very long ones :-) -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: Alan P. <ap...@re...> - 2003-06-20 23:56:39
|
In article <200...@ec...>, Eric C. Cooper wrote: > On Fri, Jun 20, 2003 at 05:29:26PM -0500, Brian Hurt wrote: >> On Fri, 20 Jun 2003, Eric C. Cooper wrote: >> > I'm probably missing some essential context, but what is wrong with >> > the simple approach of using a mutable 'a array that gets replaced >> > with a bigger one when you try to set an element beyond its current >> > length? >> >> Because then you are allocating a new array- and copying all the elements >> of the old array- every time you add a new element to the end of the >> array. So if you add n elements to an empty array, you end up allocating >> n arrays, and copying n*(n-1)/2 elements. At which point you might as >> well be using lists and @. > > If you double the array each time you need to grow it, you will > create only O(log n) arrays, and copy only O(n log n) elements. Do > any of the other approaches have better worst-case performance? What is under discussion is what to put in the "empty" slots of the newly doubled array, ocaml having no "null" or "undef" value to stuff in them. |
From: Eric C. C. <ec...@cm...> - 2003-06-20 23:45:06
|
On Fri, Jun 20, 2003 at 05:29:26PM -0500, Brian Hurt wrote: > On Fri, 20 Jun 2003, Eric C. Cooper wrote: > > I'm probably missing some essential context, but what is wrong with > > the simple approach of using a mutable 'a array that gets replaced > > with a bigger one when you try to set an element beyond its current > > length? > > Because then you are allocating a new array- and copying all the elements > of the old array- every time you add a new element to the end of the > array. So if you add n elements to an empty array, you end up allocating > n arrays, and copying n*(n-1)/2 elements. At which point you might as > well be using lists and @. If you double the array each time you need to grow it, you will create only O(log n) arrays, and copy only O(n log n) elements. Do any of the other approaches have better worst-case performance? -- Eric C. Cooper e c c @ c m u . e d u |
From: Michal M. <mal...@pl...> - 2003-06-20 22:31:24
|
On Fri, Jun 20, 2003 at 03:58:38PM -0500, Brian Hurt wrote: > > First off, an apology. I was grumpy this morning about things completely > irrelevent to the list- and thus responded a lot more hotly than was > called for. Sorry. And thank you for being rational enough not to flame > in return. > > Unless I'm missing something (which is possible), there are three possible > implementations of Dynarray that I can see working: > > 1) Have the internal array be a 'a option array, and not a 'a array as it > is currently. This means we are going through at least one, maybe 2 > references for every element, need it or not. If I understand things > correctly, in this case a dynarray of booleans takes up 3 words per entry, > and retrieving on takes two or three dereferences. I just did some mini-tests, and it seems that compiler uses one Some true and Some false value for all entries, i.e. one boolean takes one word in bool option array. However for ints it takes 3 words per array entry (if int's are all different, when setting all the array to Some 3, it took one word per entry). > 2) Use the null value kludge. I agree that this is far from optimal- but > it allows me to avoid options were not needed, and still write in Ocaml. Maybe use first value put into array as null value, and maybe provide way of setting null value as additional interface? This would have one drawback, that this first value might be complex object with references that would otherwise got GC-ed. > WRT to unboxed floats, I myself wouldn't be horrified. The only data > structure in which floats are unboxed (to my knowledge) is arrays. Records with only-floats fields also use unboxed floats. However my idea about unboxed floats in ocaml, is that floats are extensively used in benchmarks :-) -- : Michal Moskal :: http://www.kernel.pl/~malekith : GCS {C,UL}++++$ a? !tv : When in doubt, use brute force. -- Ken Thompson : {E-,w}-- {b++,e}>+++ h |
From: Brian H. <bri...@ql...> - 2003-06-20 22:08:52
|
On Fri, 20 Jun 2003, Eric C. Cooper wrote: > On Fri, Jun 20, 2003 at 03:58:38PM -0500, Brian Hurt wrote: > > Unless I'm missing something (which is possible), there are three possible > > implementations of Dynarray that I can see working: > > > > 1) Have the internal array be a 'a option array, and not a 'a array as it > > is currently. [...] > > 2) Use the null value kludge. [...] > > 3) Write in C, leaving floats unboxed. [...] > > I'm probably missing some essential context, but what is wrong with > the simple approach of using a mutable 'a array that gets replaced > with a bigger one when you try to set an element beyond its current > length? Because then you are allocating a new array- and copying all the elements of the old array- every time you add a new element to the end of the array. So if you add n elements to an empty array, you end up allocating n arrays, and copying n*(n-1)/2 elements. At which point you might as well be using lists and @. Brian |
From: Brian H. <bri...@ql...> - 2003-06-20 22:05:44
|
On Fri, 20 Jun 2003, Nicolas Cannasse wrote: > I added the complete documention to Enum. (comments are welcome ) > While reading the documentation specification, I found a bug. > Actually the iter2 , fold2 and others were not correct because of the > following > > f (t.next()) (u.next()) > > if t.next() was called and then u.next() was raising No_more_elements , we > were loosing an element of t. > I modified the implementation of theses so now the element is saved and > pushed back (using the new "push" function) if needed. > But as for map2 (and map2i) this would involve setting up a try .. catch > block for every call to next , which has already been tagged as > not-so-efficient , I have then temporay removed theses two. This is one place where has_next would be usefull. It would allow us to implement iter2 like: let iter2 f t u = let rec loop () = if (t.has_next()) && (u.has_next()) then begin f (t.next()) (u.next()); loop () end else () in loop () Brian |
From: Eric C. C. <ec...@cm...> - 2003-06-20 22:02:26
|
On Fri, Jun 20, 2003 at 03:58:38PM -0500, Brian Hurt wrote: > Unless I'm missing something (which is possible), there are three possible > implementations of Dynarray that I can see working: > > 1) Have the internal array be a 'a option array, and not a 'a array as it > is currently. [...] > 2) Use the null value kludge. [...] > 3) Write in C, leaving floats unboxed. [...] I'm probably missing some essential context, but what is wrong with the simple approach of using a mutable 'a array that gets replaced with a bigger one when you try to set an element beyond its current length? Something like the following: type 'a t = { mutable data : 'a array; mutable length : int } let make n v = { data = Array.make n v; length = n } let empty () = { data = [||]; length = 0 } let length a = a.length let get a i = if i < a.length then a.data.(i) else raise (Invalid_argument "Dynarray.get") let set a i v = if i >= a.length then begin if i >= Array.length a.data then begin let n = Array.length a.data in let newlen = max (i+1) (2*n) in let newdata = Array.make newlen v in Array.blit a.data 0 newdata 0 n; a.data <- newdata end; a.length <- i + 1 end; a.data.(i) <- v -- Eric C. Cooper e c c @ c m u . e d u |
From: Brian H. <bri...@ql...> - 2003-06-20 20:38:11
|
First off, an apology. I was grumpy this morning about things completely irrelevent to the list- and thus responded a lot more hotly than was called for. Sorry. And thank you for being rational enough not to flame in return. Unless I'm missing something (which is possible), there are three possible implementations of Dynarray that I can see working: 1) Have the internal array be a 'a option array, and not a 'a array as it is currently. This means we are going through at least one, maybe 2 references for every element, need it or not. If I understand things correctly, in this case a dynarray of booleans takes up 3 words per entry, and retrieving on takes two or three dereferences. 2) Use the null value kludge. I agree that this is far from optimal- but it allows me to avoid options were not needed, and still write in Ocaml. 3) Write in C, leaving floats unboxed. Actually, thinking about it, we could do this in C even keeping floats unboxed- if we don't allocate the array until the first element is added (and we know what type it is). I'm not opposed to option 3. Heck, I'd even support an implemention along those lines. I'm just unlikely to write it anytime soon. On Sat, 21 Jun 2003, John Max Skaller wrote: > There is no need to do that. The proposal is > to make an array of ints. And cast input values, > all of which are boxed, to int. And cast > values being fetched into single values, > all of which are boxed. > This won't work- it'll cause the GC to screw up. Since they're int's (the low bits are set), the GC won't follow any references. If the last reference to an object is in a Dynarray.t, the GC will think there aren't *any* references to the object at all, and collect the object. Then, when the reference is fished out of the int it was stored in and returned, all sorts of chaos will happen. Is there any sort of documentation kicking around about Obj? > > But one of the reasons I'm contributing to ExtLib is that I want to do > > some programming in *Ocaml*. But I'm willing to take it under advisement, > > and put it on my to do list. Just don't hold your breath. > > > I agree, it would be nice to do everything in Ocaml. > > But this is a fundamental data type it seems we cannot do > in Ocaml. > Actually, we can do it. Just have the inner array be a 'a option array instead of just a 'a array. > The null value is not acceptable at all. Just suppose > that you wanted to use a Dynarray.t to implement > some part of another polymorphic container .. > then you would have to pass the null value to IT too. > So the null value 'permeates' any data structure > you use it in. For example a list which automagically > converts to an array to improve performance. Yep. > > So if it is necessary to do a C implementation, > then do one, because it will make Ocaml > programming so much easier. > > I need a dynamic array. I'm using lists far too often. > I can implement it in Ocaml easily as an array ref, > but I'd have to copy the array on *every* append > operation. So take the time to write one. Write it once, write it correct, and submit the code. WRT to unboxed floats, I myself wouldn't be horrified. The only data structure in which floats are unboxed (to my knowledge) is arrays. If the data size is that important, use arrays. Actually, as on average about 1/2 of the dynarray is allocated by unused, I could make the argument that keeping floats in dynarrays unboxed actually uses less memory. > You have my sympathy. I have recently been blasting the > committee about the mess. But I'm on it, I'm entitled :-) I've been trying to keep the amount of C++ discussions on the Ocaml lists to an absolute minimum. As they are off topic. Brian |
From: John M. S. <sk...@oz...> - 2003-06-20 19:27:50
|
Brian Hurt wrote: > On Fri, 20 Jun 2003, John Max Skaller wrote: > > It is both fair and true. Notice that the examples I gave, all except > void* were unboxed. I don't know how many C++ compilers can figure out > that a Vector<foo*> and a Vector<bar*> can use the same object code. You didn't understand. They ALL can. Because it doesn't require any work by the compiler at all. It is the CLIENT that does it: template<class T> class vector<T*> { T* get(i) { static_cast<vector<void*> >(*this).get(i); } ... } All the compiler has to do is inline the get method call, which consists exclusively of type casting, and then dispatching to the common routine used by vectors of T* for all T. >>The principal problem in C++ here is the lack of >>a garbage collector. >> > > If this is the only reason you're using Ocaml and not C++, take a look at: > http://www.hpl.hp.com/personal/Hans_Boehm/gc/ > > Or you might take a gander at this language: > http://java.sun.com/ > I hear they're adding both templates and operator overloading in the next > version. Don't be silly. I said "the principle problem in C++ *here*" refering to the context of discussion. I meant: a vector of pointers in C++ isn't a substitute for a vector of T, because of memory management issues. You can do the very same trick for handles as raw pointers in C++, but that still won't handle recursive data structures .. > Don't have casts in Ocaml. you have Obj.magic, which is a cast. > I don't know of a way to force unboxing of > floats in an array without coding in C. There is no need to do that. The proposal is to make an array of ints. And cast input values, all of which are boxed, to int. And cast values being fetched into single values, all of which are boxed. If I could do that, Obj.magic() > would work. So we're talking about rewritting the whole thing in C. > Could be done. Don't use ints for the array type, use the Value type. Ok. > But one of the reasons I'm contributing to ExtLib is that I want to do > some programming in *Ocaml*. But I'm willing to take it under advisement, > and put it on my to do list. Just don't hold your breath. I agree, it would be nice to do everything in Ocaml. But this is a fundamental data type it seems we cannot do in Ocaml. The null value is not acceptable at all. Just suppose that you wanted to use a Dynarray.t to implement some part of another polymorphic container .. then you would have to pass the null value to IT too. So the null value 'permeates' any data structure you use it in. For example a list which automagically converts to an array to improve performance. So if it is necessary to do a C implementation, then do one, because it will make Ocaml programming so much easier. I need a dynamic array. I'm using lists far too often. I can implement it in Ocaml easily as an array ref, but I'd have to copy the array on *every* append operation. > And are you willing to gaurentee that no one will ever care that float > Dynarray.t's are boxed? "But float arrays are unboxed! This makes all my > data structures 50% larger than they need to be! Bloat! Horror! > Dynarrays should work just like arrays, but dynamic!" I can't guarrantee that, I'm just saying that if I have to choose, the null value's got to go. I won't use Dynarray otherwise, its already annoying to have to fill an array with a dummy value when I then immediately fill it with some other data: I can get around that with function, but it is often very inconvenient to have to wrap procedural code up. An important advantage of Dynarray is that it doesn't do gratuitous initialisations: neither wasting time nor requiring a dummy value. > Brian, who is snarly because he's having to wade through some atrocious > C++, and isn't in the mood to listen to people sing hosanahs to the > language. You have my sympathy. I have recently been blasting the committee about the mess. But I'm on it, I'm entitled :-) I'm not claiming C++ is a good language, especially compared to Ocaml. I'm just commenting that in this area, it is much better at handling unboxed data types, and therefore often can generated more efficient code. You are right that means *more* code, and also that in general, it instantiates needless duplicates of the same code... for example, the vector<void*> will often do for vector<int> on a Unix machine. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: John M. S. <sk...@oz...> - 2003-06-20 18:56:11
|
Nicolas Cannasse wrote: > Not exactly. > I can think of some samples which are worth the bugfix. > You can do iter2 on two enums, and then want to do "something" with the > remaining elements. You can do iter2 on 2 enums and raise an exception to get a premature exit, so the enums are both still populated. -- John Max Skaller, mailto:sk...@oz... snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia. voice:61-2-9660-0850 |
From: Brian H. <bri...@ql...> - 2003-06-20 15:48:01
|
On Fri, 20 Jun 2003, John Max Skaller wrote: > Brian Hurt wrote: > > > >>C++ vector does this correctly. > >> > > > > At the cost of emitting a whole new wad og object code for every single > > type you put into a vector. Vector<int> requires the compiler to produce > > a whole vector class to deal with ints, seperate from Vector<short>, > > Vector<char>, Vector<void *>, etc. > > > Your comparison is unfair and untrue. > > If you make the equivalent vector to ocaml, > you'd make a vector of pointers. It is both fair and true. Notice that the examples I gave, all except void* were unboxed. I don't know how many C++ compilers can figure out that a Vector<foo*> and a Vector<bar*> can use the same object code. I can easily envision tricky code where they couldn't. But I haven't looked recently, so they may (in the cases where they can). > The principal problem in C++ here is the lack of > a garbage collector. If this is the only reason you're using Ocaml and not C++, take a look at: http://www.hpl.hp.com/personal/Hans_Boehm/gc/ Or you might take a gander at this language: http://java.sun.com/ I hear they're adding both templates and operator overloading in the next version. > > In this case, you're going to be using a 'a option Dynarray.t one way or > > another. The question is wether you can ever get rid of the option part. > > > Nah, just box the floats. > Should be easy. Make the underlying array > type int. Use 0 as the null value. > Use casts to convert incoming and outgoing > values. Easy. Floats stay boxed though. > I don't care. > > Don't have casts in Ocaml. I don't know of a way to force unboxing of floats in an array without coding in C. If I could do that, Obj.magic() would work. So we're talking about rewritting the whole thing in C. Could be done. Don't use ints for the array type, use the Value type. But one of the reasons I'm contributing to ExtLib is that I want to do some programming in *Ocaml*. But I'm willing to take it under advisement, and put it on my to do list. Just don't hold your breath. And are you willing to gaurentee that no one will ever care that float Dynarray.t's are boxed? "But float arrays are unboxed! This makes all my data structures 50% larger than they need to be! Bloat! Horror! Dynarrays should work just like arrays, but dynamic!" Brian, who is snarly because he's having to wade through some atrocious C++, and isn't in the mood to listen to people sing hosanahs to the language. |