From: Martin J. <mar...@em...> - 2005-06-03 21:41:47
|
Hello, Some useful operations over the keys of a map are not provided by the standard Map module, neither by ExtLib's PMap, but are provided only by the Set module. Reason: removing a set of keys from a map of bindings could be done in O(n) with a Map.diff function, but only in O(n log n) with the current functions provided by the Map or Set modules. An interface for the new functions of PMap could look like this (the names are not very coherent, but it's just a draft): (* Shortcuts, just for clarity *) type ('a, 'b) map = ('a, 'b) t type 'a set = ('a, unit) map val diff : ('a, 'b) map -> ('a, 'c) map -> ('a, 'b) map val union : ('a, 'b) map -> (a, 'b) map -> ('a, 'b) map val key_union : ('a, 'b) map -> ('a, 'c) map -> 'a set val inter : ('a, 'b) map -> ('a, 'c) map -> ('a, 'b) map val inter2 : ('a, 'b) map -> ('a, 'c) map -> ('a, ('b, 'c)) map val key_inter : ('a, 'b) map -> ('a, 'c) map -> 'a set val key_subset : ('a, 'b) map -> ('a, 'c) map -> bool val key_equal : ('a, 'b) map -> ('a, 'c) map -> bool val keys : ('a, 'b) map -> 'a set val key_set : ('a, 'b) map -> 'a PSet.t val key_enum : ('a, 'b) map -> 'a Enum.t Implementation: essentially copy-pasting functions from Set, with minor modifications. Is it something that could be added to PMap in ExtLib? (or in ExtExtLib? :-) Martin -- Martin Jambon, PhD http://martin.jambon.free.fr |
From: Nicolas C. <war...@fr...> - 2005-06-04 12:10:18
|
> Hello, > > Some useful operations over the keys of a map are not provided by the > standard Map module, neither by ExtLib's PMap, but are provided only by > the Set module. Yes I would also love to set diff / union / intersection for PMaps. But I'm not sure which semantic is appropriate. Maybe only have diff / union / inter operations on *keys* would be enough ? Please fell free to post implementation, maybe based on Set module. Nicolas |
From: Martin J. <mar...@em...> - 2005-06-06 00:43:12
|
On Sat, 4 Jun 2005, Nicolas Cannasse wrote: > > Hello, > > > > Some useful operations over the keys of a map are not provided by the > > standard Map module, neither by ExtLib's PMap, but are provided only by > > the Set module. > > Yes I would also love to set diff / union / intersection for PMaps. > But I'm not sure which semantic is appropriate. > Maybe only have diff / union / inter operations on *keys* would be enough ? > Please fell free to post implementation, maybe based on Set module. What do you think of this: (* default merge = fun x y -> x *) val union : ?merge:('b -> 'b -> 'b) -> ('a, 'b) t -> ('a, 'b) t -> ('a, 'b) t (* a merge function must be given to inter *) val inter : merge:('b -> 'c -> 'd) -> ('a, 'b) t -> ('a, 'c) t -> ('a, 'd) t val diff : ('a, 'b) t -> ('a, 'c) t -> ('a, 'b) t These functions would be quite powerful, I think. The idea is to merge the values which have the same key during the union and inter operations. The keys are supposed to be interchangeable like before (in Set or Map) if the comparison function says they are equal. So I started the implementation, but it requires the quasi-duplication and careful adaptation of a lot of code, essentially because the "merge" arguments are not symmetric in this interface. For now, only the following works: val union : ('a, 'b) t -> ('a, 'b) t -> ('a, 'b) t val inter : ('a, 'b) t -> ('a, 'b) t -> ('a, 'b) t val diff : ('a, 'b) t -> ('a, 'c) t -> ('a, 'b) t ... where it is not specified which value is kept when 2 bindings have the same key (union, inter), which is really not that great. So let me know what you think, before I continue this implementation. The code is attached to this email. It compiles but was not tested. Martin -- Martin Jambon, PhD http://martin.jambon.free.fr |
From: Nicolas C. <war...@fr...> - 2005-06-06 05:54:18
|
> > > Hello, > > > > > > Some useful operations over the keys of a map are not provided by the > > > standard Map module, neither by ExtLib's PMap, but are provided only by > > > the Set module. > > > > Yes I would also love to set diff / union / intersection for PMaps. > > But I'm not sure which semantic is appropriate. > > Maybe only have diff / union / inter operations on *keys* would be enough ? > > Please fell free to post implementation, maybe based on Set module. > > What do you think of this: > > (* default merge = fun x y -> x *) > val union : ?merge:('b -> 'b -> 'b) -> ('a, 'b) t -> ('a, 'b) t -> ('a, 'b) t > > (* a merge function must be given to inter *) > val inter : merge:('b -> 'c -> 'd) -> ('a, 'b) t -> ('a, 'c) t -> ('a, 'd) t > > val diff : ('a, 'b) t -> ('a, 'c) t -> ('a, 'b) t Why not generalize a little more ? val union : merge:('b -> 'c -> 'd) -> ('a,'b) t -> ('a,'c) t -> ('a,'d) t The only problem is that we can't use "identity" as default merger. Nicolas |
From: Nicolas C. <war...@fr...> - 2005-06-06 06:57:39
|
> > What do you think of this: > > > > (* default merge = fun x y -> x *) > > val union : ?merge:('b -> 'b -> 'b) -> ('a, 'b) t -> ('a, 'b) t -> ('a, > 'b) t > > > > (* a merge function must be given to inter *) > > val inter : merge:('b -> 'c -> 'd) -> ('a, 'b) t -> ('a, 'c) t -> ('a, 'd) > t > > > > val diff : ('a, 'b) t -> ('a, 'c) t -> ('a, 'b) t > > Why not generalize a little more ? > > val union : merge:('b -> 'c -> 'd) -> ('a,'b) t -> ('a,'c) t -> ('a,'d) t > > The only problem is that we can't use "identity" as default merger. > > Nicolas After a second though I think it's little overweighted to make the user specify a merge function. Why not simply specify that in such cases, the item of the first map is selected ? That seems more reasonable. Nicolas |
From: Christophe R. <Chr...@un...> - 2005-06-06 07:55:14
Attachments:
signature.asc
|
> > > After a second though I think it's little overweighted to make the user > specify a merge function. Why not simply specify that in such cases, the > item of the first map is selected ? That seems more reasonable. > No, you can for instance implement multiset as map to natural ... and the merge function is then addition. Your merge function should even raise the exception Remove when you want to say that the result should be removed from the map ... (For insance, if you have Map to integer and want to say that 0 should be removed from the map). > Nicolas > -- Christophe Raffalli Université de Savoie Batiment Le Chablais, bureau 21 73376 Le Bourget-du-Lac Cedex tél: (33) 4 79 75 81 03 fax: (33) 4 79 75 87 42 mail: Chr...@un... www: http://www.lama.univ-savoie.fr/~RAFFALLI --------------------------------------------- IMPORTANT: this mail is signed using PGP/MIME At least Enigmail/Mozilla, mutt or evolution can check this signature. The public key is stored on www.keyserver.net --------------------------------------------- |
From: Nicolas C. <war...@fr...> - 2005-06-06 08:08:20
|
>> After a second though I think it's little overweighted to make the user >> specify a merge function. Why not simply specify that in such cases, the >> item of the first map is selected ? That seems more reasonable. >> > >No, you can for instance implement multiset as map to natural ... and >the merge function is then addition. > >Your merge function should even raise the exception Remove when you want >to say that the result should be removed from the map ... (For insance, >if you have Map to integer and want to say that 0 should be removed from >the map). I would say it's bad API design to use a "merge" function in order to remove elements. A more specific function should be provided. Nicolas |
From: Christophe R. <Chr...@un...> - 2005-06-06 08:17:52
Attachments:
signature.asc
|
Nicolas Cannasse wrote: >>>After a second though I think it's little overweighted to make the user >>>specify a merge function. Why not simply specify that in such cases, the >>>item of the first map is selected ? That seems more reasonable. >>> >> >>No, you can for instance implement multiset as map to natural ... and >>the merge function is then addition. >> >>Your merge function should even raise the exception Remove when you want >>to say that the result should be removed from the map ... (For insance, >>if you have Map to integer and want to say that 0 should be removed from >>the map). > > > I would say it's bad API design to use a "merge" function in order to remove > elements. > A more specific function should be provided. I do not think so. When your merge gives accidentally 0, and the value should be removed, you do not want to scan the map once more to remove the zero ... > Nicolas > -- Christophe Raffalli Université de Savoie Batiment Le Chablais, bureau 21 73376 Le Bourget-du-Lac Cedex tél: (33) 4 79 75 81 03 fax: (33) 4 79 75 87 42 mail: Chr...@un... www: http://www.lama.univ-savoie.fr/~RAFFALLI --------------------------------------------- IMPORTANT: this mail is signed using PGP/MIME At least Enigmail/Mozilla, mutt or evolution can check this signature. The public key is stored on www.keyserver.net --------------------------------------------- |
From: fva <fv...@ts...> - 2005-06-06 10:28:39
|
Hello Christophe Raffalli wrote: >Nicolas Cannasse wrote: > > >>>>After a second though I think it's little overweighted to make the user >>>>specify a merge function. Why not simply specify that in such cases, the >>>>item of the first map is selected ? That seems more reasonable. >>>> >>>> I like the idea of the merge function parameter... This makes sense when you do not rely on the identity to provide for equality of representations in the base set. >>>No, you can for instance implement multiset as map to natural ... and >>>the merge function is then addition. >>> >>>Your merge function should even raise the exception Remove when you want >>>to say that the result should be removed from the map ... (For insance, >>>if you have Map to integer and want to say that 0 should be removed from >>>the map). >>> >>> >>I would say it's bad API design to use a "merge" function in order to remove >>elements. >>A more specific function should be provided. >> >> > >I do not think so. When your merge gives accidentally 0, and the value >should be removed, you do not want to scan the map once more to remove >the zero ... > > In my opinion, this depends on the representation (i.e. eventually the semantics): 1) If you are representing the *support* i.e. those elements *not* mapping to the null element (whatever that means) you should remove those elements the merge operationg of which map to null 2) If you are representing the whole map/set/many-valued set whatever in its entirety, you should keep the merged value, whatever the result. But again, this is just an opinion... Regards, F. Valverde |
From: Martin J. <mar...@em...> - 2005-06-06 21:05:33
|
On Mon, 6 Jun 2005, Christophe Raffalli wrote: > Nicolas Cannasse wrote: > >>>After a second though I think it's little overweighted to make the user > >>>specify a merge function. Why not simply specify that in such cases, the > >>>item of the first map is selected ? That seems more reasonable. > >>> > >> > >>No, you can for instance implement multiset as map to natural ... and > >>the merge function is then addition. > >> > >>Your merge function should even raise the exception Remove when you want > >>to say that the result should be removed from the map ... (For insance, > >>if you have Map to integer and want to say that 0 should be removed from > >>the map). > > > > > > I would say it's bad API design to use a "merge" function in order to remove > > elements. > > A more specific function should be provided. > > I do not think so. When your merge gives accidentally 0, and the value > should be removed, you do not want to scan the map once more to remove > the zero ... I concur. Maybe we can use another name than "merge", but I like this use of exceptions. Here the term "exception" is really appropriate since it's an exception to the mathematical meaning of union (or intersection). Martin -- Martin Jambon, PhD http://martin.jambon.free.fr |
From: Alain F. <Ala...@in...> - 2005-06-07 06:47:33
|
If you want to provide a generic point-wise binary combinator for maps, the type should be something like: (key -> 'a option -> 'b option -> 'c option) -> 'a t -> 'b t -> 'c t The first argument is never called with two None arguments, and you might prefer to use exception to signal elements to remove, instead of a None result, so another type could be: (key -> 'a -> 'c) -> (key -> 'a -> 'b -> 'c) -> (key -> 'b -> 'c) -> 'a -> 'b t -> 'c t In my experience, having such a generic point-wise combinator on map-like structures is really useful. Maybe you also want special cases (as a convenience for the user but also to optimize the implementation), but I really believe a generic version should be offered in the interface. -- Alain |
From: Nicolas C. <war...@fr...> - 2005-06-07 07:23:04
|
> If you want to provide a generic point-wise binary combinator for maps, > the type should be something like: > > (key -> 'a option -> 'b option -> 'c option) -> > 'a t -> 'b t -> 'c t > > The first argument is never called with two None arguments, and > you might prefer to use exception to signal elements to remove, instead > of a None result, so another type could be: > > (key -> 'a -> 'c) -> (key -> 'a -> 'b -> 'c) -> (key -> 'b -> 'c) -> > 'a -> 'b t -> 'c t > > > In my experience, having such a generic point-wise combinator on > map-like structures is really useful. Maybe you also want special cases > (as a convenience for the user but also to optimize the implementation), > but I really believe a generic version should be offered in the interface. > > -- Alain I agree too. One generic function for specific uses plus some specific functions merge / union / diff for fast (no boxing, no callback) and easy usage. Nicolas |
From: Nicolas C. <war...@fr...> - 2005-06-06 08:24:58
|
> > I would say it's bad API design to use a "merge" function in order to remove > > elements. > > A more specific function should be provided. > > I do not think so. When your merge gives accidentally 0, and the value > should be removed, you do not want to scan the map once more to remove > the zero ... Then it's no longer a merge... What you want is a more generic function that will do a merge+filter. Nicolas |
From: Martin J. <mar...@em...> - 2005-06-09 20:03:06
|
On Mon, 6 Jun 2005, Nicolas Cannasse wrote: > > > I would say it's bad API design to use a "merge" function in order to > remove > > > elements. > > > A more specific function should be provided. > > > > I do not think so. When your merge gives accidentally 0, and the value > > should be removed, you do not want to scan the map once more to remove > > the zero ... > > Then it's no longer a merge... > What you want is a more generic function that will do a merge+filter. "reduce" as suggested by Florian seems to be appropriate. The following signatures seem good to me: (* Exception Remove can be raised by the user-given functions and means "do not keep a binding to this key" for both union and inter. *) val union : ('key -> 'a -> 'c) -> ('key -> 'a -> 'b -> 'c) -> ('key -> 'b -> 'c) -> ('key, 'a) t -> ('key, 'b) t -> ('key, 'c) t val inter : ('key -> 'a -> 'b -> 'c) -> ('key, 'a) t -> ('key, 'b) t -> ('key, 'c) t val diff : ('key, 'a) t -> ('key, 'b) t -> ('key, 'a) t Plus maybe some specialized versions for efficiency and/or convenience. I am not sure if the gain in performance would be terrific, and specialized versions of union and inter can easily be defined out of PMap anyway, so I wouldn't insist too much on having those functions or not. Martin -- Martin Jambon, PhD http://martin.jambon.free.fr |
From: Martin J. <mar...@em...> - 2005-06-09 20:48:48
|
On Thu, 9 Jun 2005, Martin Jambon wrote: [...] > Plus maybe some specialized versions for efficiency and/or convenience. > > I am not sure if the gain in performance would be terrific, and Actually the union between a big and a small map could be quite fast if some subtrees can be kept unmodified and don't have to be scanned (shared between the arguments and the result). The algorithm in this case is close to the one which is used in Set, but scanning systematically each element on both maps given as arguments is a bit different. Martin -- Martin Jambon, PhD http://martin.jambon.free.fr |
From: John S. <sk...@us...> - 2005-06-09 23:33:14
|
On Thu, 2005-06-09 at 13:02 -0700, Martin Jambon wrote: > On Mon, 6 Jun 2005, Nicolas Cannasse wrote: > The following signatures seem good to me: > > (* Exception Remove can be raised by the user-given functions and means > "do not keep a binding to this key" for both union and inter. *) > > val union : > ('key -> 'a -> 'c) -> ('key -> 'a -> 'b -> 'c) -> ('key -> 'b -> 'c) -> > ('key, 'a) t -> ('key, 'b) t -> ('key, 'c) t > > val inter : > ('key -> 'a -> 'b -> 'c) -> ('key, 'a) t -> ('key, 'b) t -> ('key, 'c) t > > val diff : > ('key, 'a) t -> ('key, 'b) t -> ('key, 'a) t Just a minor comment, unrelated to the actual topic but concerning these names: it isn't clear from the name if 'diff' is setwise subtraction or symmetric difference. It isn't clear in the Ocaml standard library either. I believe 'diff' is actually subtraction, not the symmetric difference, which makes the choice of name extremely unfortunate. Both functions should be provided, they're both useful and expensive to compute with a multi-term expression eg: symdiff x y = sub (union x y) (inter x y) or sub x y = inter x (symdiff x y) Of course one can easily define both with only membership test and fold. Perhaps if the symmetric difference was provided and named symdiff it would be clear that 'diff' was the substraction. -- John Skaller, skaller at users.sf.net PO Box 401 Glebe, NSW 2037, Australia Ph:61-2-96600850 Download Felix here: http://felix.sf.net |
From: Christophe R. <Chr...@un...> - 2005-06-10 07:17:44
Attachments:
signature.asc
|
> symdiff x y = sub (union x y) (inter x y) > or > sub x y = inter x (symdiff x y) > > Of course one can easily define both with only membership > test and fold. > > Perhaps if the symmetric difference was provided and named > > symdiff > > it would be clear that 'diff' was the substraction. > > sub and symdiff are better name anyway ... -- Christophe Raffalli Université de Savoie Batiment Le Chablais, bureau 21 73376 Le Bourget-du-Lac Cedex tél: (33) 4 79 75 81 03 fax: (33) 4 79 75 87 42 mail: Chr...@un... www: http://www.lama.univ-savoie.fr/~RAFFALLI --------------------------------------------- IMPORTANT: this mail is signed using PGP/MIME At least Enigmail/Mozilla, mutt or evolution can check this signature. The public key is stored on www.keyserver.net --------------------------------------------- |
From: Martin J. <mar...@em...> - 2005-06-06 21:06:12
|
On Mon, 6 Jun 2005, Christophe Raffalli wrote: > > After a second though I think it's little overweighted to make the user > > specify a merge function. Why not simply specify that in such cases, the > > item of the first map is selected ? That seems more reasonable. > > > > No, you can for instance implement multiset as map to natural ... and > the merge function is then addition. > > Your merge function should even raise the exception Remove when you want > to say that the result should be removed from the map ... (For insance, > if you have Map to integer and want to say that 0 should be removed from > the map). Another function which could be useful would be one which adds one element, just like: fun key value m -> union (singleton key value) m but slightly more efficient, and accepting the same options as "union". Martin -- Martin Jambon, PhD http://martin.jambon.free.fr |
From: Florian H. <ha...@bi...> - 2005-06-07 07:33:46
|
Nicolas Cannasse wrote: > After a second though I think it's little overweighted to make the user > specify a merge function. Why not simply specify that in such cases, the > item of the first map is selected ? If I wanted to implement something like googles map-reduce-thingy in ocaml (which may be the easiest way to get anything out of a multicore processor), I might just want to marshal the results to send them around and use this merge function as the reduce operation. So if you don't want to force the user to select a merge function, at least don't force him to reimplement union to be able to do so, i. e. make it let union ?(reduce = fun key v1 v2 -> v1) map1 map2 = ... I strongly support the exception proposal for unwanted values. Yours, Florian |
From: Martin J. <mar...@em...> - 2005-06-06 21:05:20
|
On Mon, 6 Jun 2005, Nicolas Cannasse wrote: > > > > Hello, > > > > > > > > Some useful operations over the keys of a map are not provided by the > > > > standard Map module, neither by ExtLib's PMap, but are provided only > by > > > > the Set module. > > > > > > Yes I would also love to set diff / union / intersection for PMaps. > > > But I'm not sure which semantic is appropriate. > > > Maybe only have diff / union / inter operations on *keys* would be > enough ? > > > Please fell free to post implementation, maybe based on Set module. > > > > What do you think of this: > > > > (* default merge = fun x y -> x *) > > val union : ?merge:('b -> 'b -> 'b) -> ('a, 'b) t -> ('a, 'b) t -> ('a, > 'b) t > > > > (* a merge function must be given to inter *) > > val inter : merge:('b -> 'c -> 'd) -> ('a, 'b) t -> ('a, 'c) t -> ('a, 'd) > t > > > > val diff : ('a, 'b) t -> ('a, 'c) t -> ('a, 'b) t > > Why not generalize a little more ? > > val union : merge:('b -> 'c -> 'd) -> ('a,'b) t -> ('a,'c) t -> ('a,'d) t > > The only problem is that we can't use "identity" as default merger. It cannot be so simple with the union function because you have to provide 2 additional functions of type ('b -> 'd) and ('c -> 'd) to convert the bindings which are not in both maps (cannot default to identity either). This could also be achieved by only one merge function of type ('b option -> 'c option -> 'd), but that's really heavy. In that case, I would first convert my maps of different types to the same type and then use the union function which expects 2 maps of the same type. Since the cost would be linear with respect to the size of the maps, I guess it's OK. Martin -- Martin Jambon, PhD http://martin.jambon.free.fr |