From: Martin J. <mar...@em...> - 2004-04-28 18:02:41
Attachments:
test_iterator.ml
[inspired by discussions on caml-list]

On Wed, 28 Apr 2004, Nicolas Cannasse wrote:

> Please state clearly your problems on the ExtLib mailing list, I will be
> happy to think about these :)
> Enums are still young so there are not yet specialized enums that can remove
> elements while iterating or give random access. Maybe in the future...

Enum provides some kind of forward iterators (given the Enum.get
function) - and much more.

About iterators:
With arrays, it is natural to expect some bidirectional behavior,
including 'slice' iteration, but of course this is impossible if the
iterator has type Enum.t.

Something nice could be done with objects: Iterator.array would create
an object with methods "previous" and "next" while List.to_enum would
provide only a "next" method.
[here, "next" and "previous" are equivalents of ++ and --; they modify
the original iterator in place rather than returning a new one]

  let _ = List.iter (fun obj -> obj#next) [ int_array_enum; int_list_enum ]

A similar approach could be used without objects, like this:

  let (next_a, copy_a) =
    Iterator.Backward.array ~first: 3 [| 0; 1; 2; 3; 4; 5 |];;
  let (next_b, copy_b) = Iterator.Forward.list [ 0; 1; 2; 3 ];;

Here is the test:

  let print next_a next_b =
    try
      while true do
        Printf.printf "%i %i\n" (next_a ()) (next_b ())
      done
    with Iterator.End -> ()

  let _ =
    let (next_a', _) = copy_a () in
    let (next_b', _) = copy_b () in
    ignore (next_a ());
    print next_a next_b;
    print next_a' next_b'

The output is:

  4 0
  3 1
  5 0
  4 1
  3 2

I attach the full code. Compile with -rectypes:

  ocaml -rectypes test_iterator.ml

I was happy to play with this... However I am not convinced about the
usefulness of iterators in OCaml. I think that it could be an
interesting approach in some cases, like the example about graphs that
was posted: for recording the current position in the iteration. This
requires some memory allocation. Efficiency matters, and this is why
higher order functions seem preferable in most cases.
Some additional remarks:

1) Hashtbl.enum works in O(n), is this allowed?

2) The implementation of Hashtbl.keys is terribly inefficient:

  let keys h = Enum.map (fun (k,_) -> k) (enum h)

Why not:

  let list_keys h = fold (fun key data accu -> key :: accu) h []
  let enum_keys h = List.enum (list_keys h)

3) Why make every module depend on Enum?
I think that Enum should provide conversion functions from/to more
basic containers. In the standard lib, there are Array.to_list and
Array.of_list because lists are more important than arrays.
Since Enum provides a super container, I think it would be more natural
to make it depend on the core containers such as lists, arrays, and
other things that could be implemented independently from Enum.
Then Enum would have to define Enum.of_list, Enum.of_array and so on...
and I would avoid its use over mutable data, except in the very
specific cases of fixed length arrays and strings.

Martin
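The closure-based iterators described in the message above can be sketched roughly as follows. The attached test_iterator.ml is not reproduced in the archive, so this is only an illustration under assumptions: the record type `iter` and the functions `forward_list`, `backward_array` and `to_list` are hypothetical names, and a record is used instead of a pair of closures so that the code compiles without -rectypes. The `~first` argument is read here as the lowest index of the slice, which matches the output shown in the message.

```ocaml
(* Hypothetical sketch; not the code from the attachment. *)
exception End

type 'a iter = {
  next : unit -> 'a;       (* return the current element and advance *)
  copy : unit -> 'a iter;  (* independent iterator at the same position *)
}

(* Forward iteration over a list. *)
let rec forward_list l =
  let cur = ref l in
  { next = (fun () ->
      match !cur with
      | [] -> raise End
      | x :: tl -> cur := tl; x);
    copy = (fun () -> forward_list !cur) }

(* Walk an array from index [start] down to index [first], inclusive. *)
let rec backward_slice a ~first ~start =
  let pos = ref start in
  { next = (fun () ->
      if !pos < first then raise End
      else begin
        let x = a.(!pos) in
        decr pos;
        x
      end);
    copy = (fun () -> backward_slice a ~first ~start:!pos) }

let backward_array ?(first = 0) a =
  backward_slice a ~first ~start:(Array.length a - 1)

(* Drain an iterator into a list, for inspection. *)
let to_list it =
  let rec loop acc =
    match it.next () with
    | x -> loop (x :: acc)
    | exception End -> List.rev acc
  in
  loop []
```

Each call to `copy` allocates a fresh reference cell, which is the memory cost the message mentions; a copy taken before the original advances keeps its own position.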
From: Bardur A. <oca...@sc...> - 2004-04-28 18:17:02
On Thu, Apr 29, 2004 at 02:00:58AM +0800, Martin Jambon wrote:

> [inspired by discussions on caml-list]
> 3) Why making every module depend on Enum?
> I think that Enum should provide conversion functions from/to more basic
> containers. In the standard lib, there are Array.to_list and Array.of_list
> because lists are more important than arrays.
> Since Enum provides a super container, I think it would be more natural
> to make it depend on the core containers such as lists, arrays, and other
> things that could be implemented independently from Enum.
> Then Enum would have to define Enum.of_list, Enum.of_array and so on...
> and I would avoid its use over mutable data, except in the very specific
> cases of fixed length arrays and strings.
>

There are several reasons for this; two just off the top of my head:

1) This would require Enum to know about internals of
other data structures -- which would mean that they would
have to be exported! Not a good idea.

2) It would be cumbersome to extend Enum for user-defined
data structures. You would have to do what has been done
with ExtHashtbl, etc. for _every_ new ADT, and even then
you would have conflicts if two developers were to release
ADTs for which they had written such "wrappers".

--
Bardur Arantsson
<ba...@im...>
<ba...@sc...>

- In the forest lived... some trousers... called Dave.
                                  Richard Richard | Bottom
From: Martin J. <mar...@em...> - 2004-04-29 07:32:03
On Wed, 28 Apr 2004, Bardur Arantsson wrote:

> 1) This would require Enum to know about internals of
> other data structures

Really?
Give me examples. Hashtbl does not work anyway because at least one
step is not in constant time unless you modify the representation of
hash tables to fit into the enum model.
By the way, PMap.enum works in O(n) too.

> -- which would mean that they would
> have to be exported! Not a good idea.

Enum is a client of other data structures. Without other data
structures Enum is useless, whereas lists and arrays are not.

> 2) It would be cumbersome to extend Enum for user-defined
> data structures. You would have to do what has been done
> with ExtHashtbl, etc. for _every_ new ADT, and even then
> you would have conflicts if two developers were to release
> ADTs for which they had written such "wrappers".

If well-designed abstract data types are abstract and do not provide
iterators, isn't it exactly because iterators are impossible,
meaningless or simply unsafe?

Martin
From: Nicolas C. <war...@fr...> - 2004-04-29 14:54:11
> > 1) This would require Enum to know about internals of
> > other data structures
>
> Really?
> Give me examples. Hashtbl does not work anyway because at least one step
> is not in constant time unless you modify the representation of hash
> tables to fit into the enum model.

Hashtbl is constant time.
That means that the cost of a call to next() does not depend on the
number of keys in the Hashtbl.
It actually depends on the size of the Array, which is unrelated and
bound to the hashing function.
Evaluating the exact complexity is a bit more difficult.

> By the way, PMap.enum works in O(n) too.

That's true. It should not.
You're welcome to provide an O(1) implementation!
But it's a little bit tricky to do a depth-first tree traversal in an
imperative way. You need to simulate a stack by hand.

> > 2) It would be cumbersome to extend Enum for user-defined
> > data structures. You would have to do what has been done
> > with ExtHashtbl, etc. for _every_ new ADT, and even then
> > you would have conflicts if two developers were to release
> > ADTs for which they had written such "wrappers".
>
> If well-designed abstract data types are abstract and do not provide
> iterators, it is exactly because iterators are impossible, meaningless or
> simply unsafe?

If you don't like enum, don't use them, and don't complain at people that
are thinking they are, using poor arguments. Enums are functional
iterators, they're different from the C++ STL ones, and are nice for
writing algorithms that work whatever the underlying data structure.

Nicolas Cannasse
From: Martin J. <mar...@em...> - 2004-04-29 17:41:06
On Thu, 29 Apr 2004, Nicolas Cannasse wrote:

> > > 1) This would require Enum to know about internals of
> > > other data structures
> >
> > Really?
> > Give me examples. Hashtbl does not work anyway because at least one step
> > is not in constant time unless you modify the representation of hash
> > tables to fit into the enum model.
>
> Hashtbl is constant time.
> That means that cost of call to next() does not depends of the number of the
> keys in the Hashtbl.
> It actualy depends of the size of the Array, which is unrelated and bounded
> to the hashing function.
> Evaluating the exact complexity is a bit more difficult.

I am sorry to complain all the time, but you should seriously have a
look at the implementation in the standard library before making that
kind of wrong claim.
The length of the array is doubled every time the number of bindings
is found to be higher than 2 times the number of buckets.
Therefore the array is approximately as long as the number of
bindings, which is the expected property in a hash table (especially a
resizable one).

> > By the way, PMap.enum works in O(n) too.
>
> That's true. It should not.
> You're welcome to provide an O(1) implementation !
> But that's a little bit tricky to do a depth tree traversal in an imperative
> way. You need to simulate a stack by hand.

  let rec enum m =
    let rec next l () =
      match !l with
          [] -> raise Enum.No_more_elements
        | Empty :: tl -> l := tl; next l ()
        | Node (m1, key, data, m2, h) :: tl ->
            l := m1 :: m2 :: tl;
            (key, data) in
    let count l () =
      let n = ref 0 in
      let r = ref !l in
      try
        while true do
          ignore (next r ());
          incr n
        done; 0
      with Enum.No_more_elements -> !n in
    let rec clone l () =
      let new_l = ref !l in
      Enum.make ~next: (next new_l) ~count: (count new_l)
        ~clone: (clone new_l) in
    let l = ref [m.map] in
    Enum.make ~next: (next l) ~count: (count l) ~clone: (clone l)

> > > 2) It would be cumbersome to extend Enum for user-defined
> > > data structures. You would have to do what has been done
> > > with ExtHashtbl, etc. for _every_ new ADT, and even then
> > > you would have conflicts if two developers were to release
> > > ADTs for which they had written such "wrappers".
> >
> > If well-designed abstract data types are abstract and do not provide
> > iterators, it is exactly because iterators are impossible, meaningless or
> > simply unsafe?
>
> If you don't like enum, don't use them, and don't complain at people that
> are thinking they are, using poor arguments. Enum are functional iterators,
> they're different than the C++ STL ones, and are nice to write some
> algorithms that can work whatever the underlying data structure.

I am still curious to see in what kind of situations you would use
them. Please give us some examples.

I can give several examples for the Hashtbl stuff that I already
posted, and I don't expect it to be at the core of the future
"standard" library for the Objective Caml language.

I don't know what you plan to do about Int, Float and Bool additional
modules, or with tables of properties, as recently proposed by other
posters.

And why don't you merge the project with the other ExtLib?

If you don't trust other people's work, the result will be that
someone else will decide to start one more general library project.

Martin
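The "simulate a stack by hand" idea from this exchange can also be shown on a standalone tree type. The following is a minimal sketch, not PMap's actual code: the `tree` type, the local `No_more_elements` exception and the names `make_next` and `to_list` are assumptions made for illustration. Unlike the pre-order walk posted above, this variant yields bindings in order; each node is pushed and popped exactly once, so a call to `next` costs amortized O(1) and the stack never holds more nodes than the height of the tree.

```ocaml
(* Local stand-in for Enum.No_more_elements (assumption). *)
exception No_more_elements

(* Simplified binary tree; PMap's real nodes also store a height. *)
type ('k, 'v) tree =
  | Empty
  | Node of ('k, 'v) tree * 'k * 'v * ('k, 'v) tree

(* Build a [next] function that performs an in-order traversal using an
   explicit stack of nodes whose key has not been emitted yet. *)
let make_next (t : ('k, 'v) tree) : unit -> 'k * 'v =
  let stack = ref [] in
  let rec push_left = function
    | Empty -> ()
    | Node (l, _, _, _) as n ->
      stack := n :: !stack;
      push_left l
  in
  push_left t;
  let rec next () =
    match !stack with
    | [] -> raise No_more_elements
    | Empty :: tl -> stack := tl; next ()
    | Node (_, k, v, r) :: tl ->
      stack := tl;
      push_left r;
      (k, v)
  in
  next

(* Drain a [next] function into a list, for inspection. *)
let to_list next =
  let rec loop acc =
    match next () with
    | kv -> loop (kv :: acc)
    | exception No_more_elements -> List.rev acc
  in
  loop []
```

Because the stack is an immutable list held in a reference, cloning such an iterator only needs to copy the reference's current contents, which is the same trick the posted `clone` relies on.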
From: Nicolas C. <war...@fr...> - 2004-04-29 20:52:52
> > Hashtbl is constant time.
> > That means that cost of call to next() does not depends of the number of the
> > keys in the Hashtbl.
> > It actualy depends of the size of the Array, which is unrelated and bounded
> > to the hashing function.
> > Evaluating the exact complexity is a bit more difficult.
>
> I am sorry to complain all the time, but you should seriously have a look
> at the implementation in the standard library before claiming that kind of
> wrong things.
> The length of the array is doubled every time that the number of
> bindings is found to be higher than 2 times the number of buckets.
> Therefore the array is approximately as long as the number of bindings,
> which is the expected property in a hash table (especially a resizable
> one).

You're right. I didn't look at how the hashtbl resizing was handled.
I think then the best way to get an efficient enum is to not copy the
Array, leaving the behavior of mutating the Hashtbl while enumerating
on it unspecified (as it is for most of the mutable data structures in
other languages).

> let rec enum m =
[...]

Thanks, I slightly modified your code and committed it to the CVS.

> > > If well-designed abstract data types are abstract and do not provide
> > > iterators, it is exactly because iterators are impossible, meaningless or
> > > simply unsafe?
> >
> > If you don't like enum, don't use them, and don't complain at people that
> > are thinking they are, using poor arguments. Enum are functional iterators,
> > they're different than the C++ STL ones, and are nice to write some
> > algorithms that can work whatever the underlying data structure.
>
> I am still curious to see in what kind of situations you would use them.
> Please give us some examples.

The rationale of Enums is the following:
- we have an increasing number of containers (list, arrays, dynarrays,
  hashtbl, pmap, ....)
- all these containers need to implement all the functional ops: map,
  iter, fold...
- writing direct conversions between every pair of n containers takes
  O(n^2) functions.
- enums are here to provide an easy way to convert between different
  containers, and to abstract them at the same time when using
  functional ops.

There are nice side effects of their laziness:
- you never need to allocate an intermediate data structure when
  filtering / mapping
- you can work with a large collection of elements without any problem.

For example, let's take a 1 M element list. Each map will duplicate it
into a brand new 1 M element list. With enums you can stack several
maps and filters, and then at the end, when you iter or fold, each
element read will go through your functional stack, be mapped or
filtered by it, and eventually reach the output.

Better: you can feed your algorithm data coming from a list, a socket,
an array, a file, or whatever container you need. Write once, use N
times, where N is the number of "sources" for your data.

If you're not convinced by this I don't know what I can add...

> I can give several examples for the Hashtbl stuff that I already posted,
> and I don't expect it to be at the core of the future "standard" library
> for the Objective Caml language.

Samples are not enough. I personally prefer rationales.

> I don't know what you plan to do about Int, Float and Bool additional
> modules, or with tables of properties, as recently proposed by other
> posters.

Int, Float and Bool are useful with functors. ExtLib is functor-free
and happy with it. Functors were already discussed and dismissed
before: please look at the archives. We have a polymorphic map (PMap)
as a Map replacement.

The tables of properties are nice but IMHO require more typing. OCaml
is not Python: providing a port of a Python library is certainly nice
for interfacing, but has no place in a general purpose library such as
ExtLib.

If you come up with a version of ConfigParser that can enforce the
config file structure and types using a kind of typed-DTD then I'll be
more than happy to put it into ExtLib and promote it on the caml list
as a good way to have persistent - still user readable - data. Having
generators of caml types from the typed-DTD would be a killer.

> And why don't you merge the project with the other ExtLib?

This is out of the question. The other ExtLib uses C code and ExtLib is
pure OCaml.

> If you don't trust other people's work, the result will be that someone
> else will decide to start one more general library project.

I do trust, when there are some strong arguments in favor of an
addition to the ExtLib: if I let everything end up in the ExtLib
without some filtering and previous discussion of the usability and
general use of the stuff then it will grow like the Java library: big
and unusable. Small and consistent is better.

However I do understand your frustration and I'm doing my best to
answer your questions.

Best Regards,

Nicolas Cannasse
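The laziness argument in the message above (stacked maps and filters with no intermediate container, and one algorithm for several sources) can be illustrated with a short sketch. It deliberately does not use ExtLib's Enum; it assumes the `Seq` module of the OCaml standard library (available since OCaml 4.07) as a stand-in with comparable lazy behaviour, and the function name `sum_of_even_squares` is made up for the example.

```ocaml
(* The algorithm is written once against a lazy sequence. Elements are
   pulled one at a time through map and filter; no intermediate list or
   array is built. *)
let sum_of_even_squares (xs : int Seq.t) : int =
  xs
  |> Seq.map (fun x -> x * x)
  |> Seq.filter (fun x -> x mod 2 = 0)
  |> Seq.fold_left ( + ) 0

(* The same function then works on different sources. *)
let from_list = sum_of_even_squares (List.to_seq [1; 2; 3; 4])
let from_array = sum_of_even_squares (Array.to_seq [| 1; 2; 3; 4 |])

let () = Printf.printf "%d %d\n" from_list from_array  (* prints "20 20" *)
```

With n container types, each one only needs a conversion to and from the sequence type, rather than a dedicated conversion to every other container.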
From: Bardur A. <oca...@sc...> - 2004-04-29 22:54:42
On Thu, Apr 29, 2004 at 10:48:31PM +0200, Nicolas Cannasse wrote:

[--snip stuff about enums--]

Thank you, sir. You said it so much better than I ever
could. :)

> OCaml is not python : providing a port of a Python library is certainly nice for
> interfacing, but has nothing to do in a general purpose library such as
> ExtLib.

/me thinks it does. :)

To me it would be so useful that it absolutely deserves to
be part of ExtLib. (To me, interoperability is not part of
the reason I think it should be included... see below).

> If you come up with a version of ConfigParser that can enforce the
> config file structure and types using a kind of typed-DTD
[...]
> Having generators of caml type from the typed-DTD will be a killer.

Say what? It seems to me you're over-complicating things
to the point of absurdity. There is a very good reason
that the ConfigParser module in Python is like it is (and
it has nothing to do with Python as a language): It is
simply _the_ most convenient and _simple_ way to parse
user-editable configuration files! There is nothing that
comes close! I mean, what could possibly be easier than:

  let my_server = ConfigParser.get_string ~section:"global" "server"
  and my_port = ConfigParser.get_int ~section:"global" "port"
  and my_bla = ConfigParser.get_float ~section:"some_section" "bla"
  [...]

You could even add a bit more compile-time checking by
using section tags, as in:

  let global_section = ConfigParser.get_section ... "global" in
  let my_server = ConfigParser.get_string global_section "server"
  ....

You may object that this is not all that general in that
it doesn't allow parsing of arbitrarily-typed values from
the config file. But why should that be a concern? People
simply do not need configuration settings that are more
complicated than those already available in the
ConfigParser(py) module. If they did, people would be
asking the python developers for them! (and the python
developers would likely be happy to oblige, they usually
are :))

Wrt. specifying various types of values (int, float, string)
by using different formatting in the config file... My
question is simply: Why? I can see absolutely no benefit
to doing this. You can still just as easily get exceptions
at parse time, and if the user doesn't already know that
some setting should be an int, then they have no business
editing that config file!

Sorry, if I come across as confrontational/annoyed, but I
am a _huge_ fan of ConfigParser (the python one that is),
I use it in all my Python-based projects where persistent
configuration is even slightly useful..

--
Bardur Arantsson
<ba...@im...>
<ba...@sc...>

- I didn't have a blunt object with me, so I said "OK".
                          David Letterman | The Late Show
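The typed getters proposed in the message above can be sketched in a few lines. This is a hypothetical illustration only: no such `ConfigParser` module is confirmed by the thread for OCaml, so the functions `parse`, `get_string`, `get_int` and `get_float` below are invented names, and the parser handles only `[section]` headers, `key = value` lines, blank lines and `#` or `;` comments.

```ocaml
(* Hypothetical minimal .ini reader; values are stored as strings,
   keyed by (section, key). *)
let parse (text : string) : (string * string, string) Hashtbl.t =
  let tbl = Hashtbl.create 16 in
  let section = ref "" in
  String.split_on_char '\n' text
  |> List.iter (fun raw ->
       let line = String.trim raw in
       let len = String.length line in
       if len = 0 || line.[0] = '#' || line.[0] = ';' then ()
       else if line.[0] = '[' && line.[len - 1] = ']' then
         section := String.sub line 1 (len - 2)
       else
         match String.index_opt line '=' with
         | Some i ->
           let key = String.trim (String.sub line 0 i) in
           let value = String.trim (String.sub line (i + 1) (len - i - 1)) in
           Hashtbl.replace tbl (!section, key) value
         | None -> failwith ("malformed line: " ^ line));
  tbl

(* Typed accessors: conversion happens on lookup, so a bad value raises
   an exception (Failure) when it is read; a missing key raises
   Not_found. *)
let get_string cfg ~section key = Hashtbl.find cfg (section, key)
let get_int cfg ~section key = int_of_string (get_string cfg ~section key)
let get_float cfg ~section key = float_of_string (get_string cfg ~section key)
```

The file format itself carries no type information here; the caller states the expected type by choosing the getter, which is the design the message argues for.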
From: John G. <jgo...@co...> - 2004-04-30 05:16:13
On Fri, Apr 30, 2004 at 12:54:36AM +0200, Bardur Arantsson wrote: > Sorry, if I come across as confrontational/annoyed, but I > am a _huge_ fan of ConfigParser (the python one that is), > I use it in all my Python-based projects where persistent > configuration is even slightly useful.. Stay tuned; my missinglib for OCaml will support Python-style interpolation very shortly.... This should render it fully compatible with Python's SafeConfigParser, and give it some extra features to boot. That is, OCaml's library can parse files generated by Python's save(), and vice-versa, and get the same results out of them. -- John |
From: Nicolas C. <war...@fr...> - 2004-04-30 07:17:40
> On Fri, Apr 30, 2004 at 12:54:36AM +0200, Bardur Arantsson wrote: > > Sorry, if I come across as confrontational/annoyed, but I > > am a _huge_ fan of ConfigParser (the python one that is), > > I use it in all my Python-based projects where persistent > > configuration is even slightly useful.. > > Stay tuned; my missinglib for OCaml will support Python-style > interpolation very shortly.... This should render it fully compatible > with Python's SafeConfigParser, and give it some extra features to boot. > That is, OCaml's library can parse files generated by Python's save(), > and vice-versa, and get the same results out of them. > > -- John Could you interface it with IO ? so one can read and write config files. What is exactly "interpolation" ? Regards, Nicolas Cannasse |
From: Bardur A. <oca...@sc...> - 2004-04-30 08:00:02
On Fri, Apr 30, 2004 at 09:16:44AM +0200, Nicolas Cannasse wrote:

> > On Fri, Apr 30, 2004 at 12:54:36AM +0200, Bardur Arantsson wrote:
> > > Sorry, if I come across as confrontational/annoyed, but I
> > > am a _huge_ fan of ConfigParser (the python one that is),
> > > I use it in all my Python-based projects where persistent
> > > configuration is even slightly useful..
> >
> > Stay tuned; my missinglib for OCaml will support Python-style
> > interpolation very shortly.... This should render it fully compatible
> > with Python's SafeConfigParser, and give it some extra features to boot.
> > That is, OCaml's library can parse files generated by Python's save(),
> > and vice-versa, and get the same results out of them.
> >
> > -- John
>
> Could you interface it with IO ? so one can read and write config files.
> What is exactly "interpolation" ?
>

Interpolation is basically just variable substitution; as
in the following Perl snippet:

  print "This is the variable value: $bla\n"

where the value of the "bla" variable from the surrounding
scope is inserted into the string.

--
Bardur Arantsson
<ba...@im...>
<ba...@sc...>

- Disco Stu doesn't advertise.
                                 Disco Stu | The Simpsons
From: John G. <jgo...@co...> - 2004-04-30 13:11:41
On Fri, Apr 30, 2004 at 09:16:44AM +0200, Nicolas Cannasse wrote:

> > Stay tuned; my missinglib for OCaml will support Python-style
> > interpolation very shortly.... This should render it fully compatible
> > with Python's SafeConfigParser, and give it some extra features to boot.
> > That is, OCaml's library can parse files generated by Python's save(),
> > and vice-versa, and get the same results out of them.
> >
> > -- John
>
> Could you interface it with IO ? so one can read and write config files.
> What is exactly "interpolation" ?

It currently works with standard channels, but yes, I could do that.
(I wouldn't feel comfortable doing it until IO is in a released Extlib,
though).

Interpolation means inserting sprintf-like statements in the config file
to refer to other values in the file or to values in a hash table
supplied by the programmer. For instance:

  [DEFAULTS]
  photos = /home/foo/photos

  [album1]
  path = %(photos)s/firstalbum
  copyright = $(copyright)s

In this instance, path will expand to /home/foo/photos/firstalbum and
copyright will expand to some value that the programmer passes in to
ConfigParser in a Hashtable (with key "copyright"). Or, if the
programmer doesn't do that, it will cause an error.

This interpolation is compatible with Python's SafeConfigParser and will
be optional.

-- John
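The expansion step described in the message above can be sketched as a single pass over the value string. This is an illustration under assumptions, not missinglib's implementation: the function name `interpolate` and its `lookup` argument are hypothetical, only the `%(name)s` form from the `path` example is handled, and the expanded value is not itself re-scanned for further references or escape sequences.

```ocaml
(* Replace every "%(name)s" in [s] by [lookup name]; fail if a name is
   unknown. Anything else is copied through unchanged. *)
let interpolate (lookup : string -> string option) (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then ()
    else if i + 1 < n && s.[i] = '%' && s.[i + 1] = '(' then begin
      match String.index_from_opt s (i + 2) ')' with
      | Some j when j + 1 < n && s.[j + 1] = 's' ->
        let name = String.sub s (i + 2) (j - i - 2) in
        (match lookup name with
         | Some v -> Buffer.add_string buf v
         | None -> failwith ("undefined option: " ^ name));
        go (j + 2)
      | _ ->
        (* Not a complete reference: keep the '%' literally. *)
        Buffer.add_char buf s.[i];
        go (i + 1)
    end else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf

(* Values taken from the example in the message. *)
let defaults = function
  | "photos" -> Some "/home/foo/photos"
  | _ -> None

let path = interpolate defaults "%(photos)s/firstalbum"
(* path = "/home/foo/photos/firstalbum" *)
```

In a real parser `lookup` would first consult the current section, then the defaults, then the table supplied by the programmer; the error case corresponds to the "it will cause an error" behaviour described above.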
From: Nicolas C. <war...@fr...> - 2004-05-02 10:22:25
> On Fri, Apr 30, 2004 at 09:16:44AM +0200, Nicolas Cannasse wrote:
> > > Stay tuned; my missinglib for OCaml will support Python-style
> > > interpolation very shortly.... This should render it fully compatible
> > > with Python's SafeConfigParser, and give it some extra features to boot.
> > > That is, OCaml's library can parse files generated by Python's save(),
> > > and vice-versa, and get the same results out of them.
> > >
> > > -- John
> >
> > Could you interface it with IO ? so one can read and write config files.
> > What is exactly "interpolation" ?
>
> It currently works with standard channels, but yes, I could do that.
> (I wouldn't feel comfortable doing it until IO is in a released Extlib,
> though).
>
> Interpolation means inserting sprintf-like statements in the config file
> to refer to other values in the file or to values in a hash table
> supplied by the programmer. For instance:
>
> [DEFAULTS]
> photos = /home/foo/photos
>
> [album1]
> path = %(photos)s/firstalbum
> copyright = $(copyright)s
>
> In this instance, path will expand to /home/foo/photos/firstalbum and
> copyright will expand to some value that the programmer passes in to
> ConfigParser in a Hashtable (with key "copyright"). Or, if the
> programmer doesn't do that, it will cause an error.
>
> This interpolation is compatible with Python's SafeConfigParser and will
> be optional.

Thank you for the information.

Here's my opinion about ConfigParser: it's nice to provide some config
data file format handling, and it's also nice to provide Python
compatibility. But I'm confused about the goal of this library: if
this file format is supposed to store app config data then I don't see
the point of having compatibility with Python. Since the config data
is app-specific, the only case where compatibility could be
interesting is a big application, with some parts written in OCaml and
some other parts in Python, both accessing the same configuration
data: that's not the general case!

Now if you want compatibility, that's maybe because you don't want
this file to store only "config" data but data in general.

And for this purpose I think XML is a far better description format
since it allows recursion and validation against a DTD or Schema, and
is widely used and available in a lot more languages than OCaml and
Python.

Now if your goal is really to store app config data then I don't
understand why you would keep Python compatibility, and why not
actually represent the data in XML?
The .INI file format is far from being the best around.

The average user either knows XML and will be able to edit it by hand,
or will not even try to edit and modify your config file unless you
provide him with a nice GUI.

Regards,
Nicolas Cannasse
From: Bardur A. <oca...@sc...> - 2004-05-02 11:04:06
On Sun, May 02, 2004 at 12:20:16PM +0200, Nicolas Cannasse wrote:

> > On Fri, Apr 30, 2004 at 09:16:44AM +0200, Nicolas Cannasse wrote:
> > > > Stay tuned; my missinglib for OCaml will support Python-style
> > > > interpolation very shortly.... This should render it fully compatible
> > > > with Python's SafeConfigParser, and give it some extra features to boot.
> > > > That is, OCaml's library can parse files generated by Python's save(),
> > > > and vice-versa, and get the same results out of them.
> > > >
> > > > -- John
> > >
> > > Could you interface it with IO ? so one can read and write config files.
> > > What is exactly "interpolation" ?
> >
> > It currently works with standard channels, but yes, I could do that.
> > (I wouldn't feel comfortable doing it until IO is in a released Extlib,
> > though).
> >
> > Interpolation means inserting sprintf-like statements in the config file
> > to refer to other values in the file or to values in a hash table
> > supplied by the programmer. For instance:
> >
> > [DEFAULTS]
> > photos = /home/foo/photos
> >
> > [album1]
> > path = %(photos)s/firstalbum
> > copyright = $(copyright)s
> >
> > In this instance, path will expand to /home/foo/photos/firstalbum and
> > copyright will expand to some value that the programmer passes in to
> > ConfigParser in a Hashtable (with key "copyright"). Or, if the
> > programmer doesn't do that, it will cause an error.
> >
> > This interpolation is compatible with Python's SafeConfigParser and will
> > be optional.
>
> Thank you for the information.
>
> Here's my opinion about ConfigParser : it's nice to
> provide some config data file format handling, it's also
> nice to provide Python compatibility. But I'm confused
> about what's the goal of this library : if this file
> format is supposed to store app config data then I don't
> see the point of having compatibility with Python. Since
> the config data is app-specific, the only case that could
> be interesting to have compatibility is to have a big
> application, some parts written in OCaml and some other
> parts in Python, with both accessing the same
> configuration data : that's not general case !

I don't know what his reasons are for wanting Python
compatibility; but mine are very simple:

- The format works!
- Users don't want to have to learn a different format.
- There is *no* good reason *NOT* to have compatibility.

>
> Now if you want to get compatibility, that's maybe
> because you don't only want this file to store "config"
> data but data only.
>
> And for this purpose I think XML is a far better
> description format since it allows recursivity, proving
> using a DTD or Schema, and is widely used and available
> in a lot more langages than Ocaml and Python.

XML is overkill for almost everything people like to put
into XML these days -- it's also much much harder to parse
(even if you have a relatively simple DTD I've found that
one tends to get bogged down by detail; but maybe that's
just because of the XML parsers I've tried (yes, that
includes PXP)).

... not that I'm saying that storing actual app. data (as
opposed to app. configuration data) in .ini is a good idea
either. But that is beside the point as I don't think
anyone is suggesting that.

> Now if your goal is really to store app config data then
> I don't undestand why to keep Python compatibility, and
> why actually not to represent the data into XML ?
> The .INI file format is far from being the best around.

Granted, but you're overlooking its usefulness in
situations where you don't NEED any more features than
.INI has. XML is utter overkill for something so simple as
configuration data which is not hierarchical in nature.

And even *if* your configuration data is hierarchical, XML
has many drawbacks:

- Contrary to popular belief, XML is not user-editable
  in the same way that any sane configuration format is
  user-editable:

  - XML is incredibly verbose; take a look at a typical
    XML config versus e.g. the sshd config file. Then
    come back and suggest XML as a configuration file
    format. It's very hard to separate the wheat from
    the chaff in your typical XML file. This makes it
    hard to edit.

  - There are special rules for symbols like "&", "<",
    ">", etc. Users do not want to have to use special
    "escaped" entities when they are not actually
    necessary. (see also next section)

- XML is way more complex than it needs to be: Take a look
  at YAML (http://www.yaml.org/). It does the same thing
  everybody seems to want XML for and does it in a much
  simpler, neater *and* more user-friendly way.

(more, but I can't be bothered to list all of it...)

> The average user, either knows XML and will be able to
> edit by-hand or will not even try to edit and modify your
> config file, unless you provide him a nice GUI.
>

Nonsense. To be able to edit *YOUR* XML the user must know
at least two very non-obvious things:

1) Proper nesting rules for *your* DTD. (Which of course
   vary from app to app, so you can't hide behind the
   "everybody knows XML" bit). That means that they need to
   know how to read DTDs, or that you have to provide a
   layman's explanation of the format -- more work for you
   (for no good reason since .ini/yaml is much easier for
   both you *and* your users).

2) Allowable attribute values (which cannot usually be
   specified in the DTD anyway -- so the DTD model fails
   here)

A .ini is trivially learned by comparison. Even if you sat
a complete beginner down in front of a computer, they
would be able to change the settings (given a text editor
:)).

Another counter-point: Not all apps have GUIs, but they
still want users to be able to change configuration
settings without having to understand XML. What are you
going to do then?

(I could go on and on about the failings of XML, but
you'll probably get tired of reading it, so I'll just stop
here...)

--
Bardur Arantsson
<ba...@im...>
<ba...@sc...>

- With this weapon I can expose fictional characters and
bring about sweeping reforms!
                              Zippy | Zippy the Pinhead
From: Bardur A. <oca...@sc...> - 2004-05-02 11:44:02
On Sun, May 02, 2004 at 01:03:56PM +0200, Bardur Arantsson wrote: [--snip--] Forgot: Most configuration file formats are not actually hierarchical in nature; so the one area where XML *might* have an advantage over .ini is not actually common usage. Btw, if you want to read more about why "XML sucks", you might want to read this: http://c2.com/cgi/wiki?XmlSucks Despite the name it is actually quite balanced and informed discussion -- the general consensus seems to be that XML is bad for almost anything except data transmission and (archival) storage. -- Bardur Arantsson <ba...@im...> <ba...@sc...> - I *never* apologize Lisa. I'm sorry, but that's just the way I am. Homer Simpson | The Simpsons |
From: Nicolas C. <war...@fr...> - 2004-05-02 12:03:54
> [--snip--]
>
> Forgot: Most configuration file formats are not actually
> hierarchical in nature; so the one area where XML *might*
> have an advantage over .ini is not actually common usage.

That's exactly what I'm saying:
- if you need a configuration file the .INI format is ok, but then I
  don't see the point of being compatible with Python, since I don't
  understand why I would need "portable" configuration files.
- if you need data exchange then XML is the de facto standard.

> Btw, if you want to read more about why "XML sucks", you
> might want to read this: http://c2.com/cgi/wiki?XmlSucks
>
> Despite the name it is actually quite balanced and
> informed discussion -- the general consensus seems to be
> that XML is bad for almost anything except data
> transmission and (archival) storage.

What's an application doing, except data transmission and storage?

More generally, I agree that the XML standards (all the XML "stack")
are too difficult and actually "suck", but the XML file format itself
is well supported in many languages.
You can have a look at my XmlLight parser, which is maybe not 100%
compliant with the thousands of pages of the different standards
(although maybe it doesn't need to be) but is actually easy to use and
lightweight: http://tech.motion-twin.com

I'm not advocating XML here, I just say that if you want users:
a) not to learn one more format
b) to have wide support of your format in many languages
c) to use this format for both "config" data and "exportable" data
then you should definitely use XML.

BTW, YAML looks interesting, I'll read some more about it later. Is
there a parser available for OCaml? If we had a way to parse and print
YAML and XML files with conversions between them, would that not be
the perfect solution?

Regards,
Nicolas Cannasse
From: Bardur A. <oca...@sc...> - 2004-05-02 12:33:07
|
On Sun, May 02, 2004 at 02:01:45PM +0200, Nicolas Cannasse wrote:
> > [--snip--]
> >
> > Forgot: Most configuration file formats are not actually hierarchical in nature; so the one area where XML *might* have an advantage over .ini is not actually common usage.
>
> That's exactly what I'm saying:
> - if you need a configuration file, the .INI format is OK, but then I don't see the point of being compatible with Python, since I don't understand why I would need "portable" configuration files.

You're right in the sense that it's not actually *necessary* to be Python-compatible. But if the format works and is easy to deal with (for users and programmers alike), what's the problem with being Python-compatible?

In other words: It seems to me that you're complaining about an aspect of CP which is actually a non-issue; it doesn't cost anything to be compatible, so why not be compatible? You haven't offered any good reasons to *not* be compatible...

> - if you need data exchange, then XML is the de facto standard.

That something is a de facto standard does not necessarily make it a good idea (see: SMTP, Microsoft Windows :), etc.).

> > Btw, if you want to read more about why "XML sucks", you might want to read this: http://c2.com/cgi/wiki?XmlSucks
> >
> > Despite the name it is actually quite a balanced and informed discussion -- the general consensus seems to be that XML is bad for almost anything except data transmission and (archival) storage.
>
> What's an application doing, except data transmission and storage?

Oh, I don't know... computation?

> More generally, I agree that the XML standards (the whole XML "stack") are too difficult and actually "suck", but the XML file format itself is well supported in many languages.
> You can have a look at my XmlLight parser, which may not be 100% compliant with the thousands of pages of the different standards (although maybe it doesn't need to be) but is actually easy to use and lightweight:
> http://tech.motion-twin.com
>
> I'm not advocating XML here, I'm just saying that if you want users:
> a) not to learn one more format

They **do** have to learn a new format (as I pointed out); amongst other things they have to learn what elements you are using in your configuration file and how they can be nested. They have to learn what values are allowed for those elements -- for instance, there is no way in XML (using the DTD) to specify that an element can only take on floating-point values (or the format of those values, for that matter).

The only thing they don't have to learn is the *syntax* of the file; the DTD says almost nothing about the semantics of the configuration elements (like "are floats or integers or strings or lists of strings allowed here?"). So you end up having to do validation yourself anyway.

The syntax of .ini is so simple that there is basically _no_ learning curve (even if you have never seen such a file before)! You still have to learn what the configuration values actually *mean*, but that is true in any case.

> b) to have wide support for your format in many languages

Frankly, I think this is a red herring. If you're interested in reading the XML of another application, then you actually need to understand the *semantics* of that XML, something which XML/DTD does not help you with. So the only thing you're getting for free is (essentially) a parser which can match <bla> to </bla>. Whoop-dee-doo. Actually understanding the semantics of the XML you read is much harder than writing a parser for the syntax of almost any interchange format imaginable...

> c) to use this format for both "config" data and "exportable" data
> then you should definitely use XML.

Debatable. For relatively simple tabular data, CSV may actually be much better than XML (both in terms of efficiency and user-editability).

> BTW, YAML looks interesting, I'll read some about it later. Is there a parser available for OCaml?

I looked for one some months ago, but couldn't find one. (Actually thought about writing one myself, but I couldn't find the time...)

> If we had a way to parse and print YAML and XML files with conversions between them, wouldn't that be the perfect solution?

That's impossible, because XML has no notion of what data actually *is*. Anything within an element is just raw data to the XML parser -- the application has to make sense of it. (You could probably specify a DTD which encompassed the same document structure as you have in YAML, but it would be pretty useless -- you could just use YAML directly instead.)

--
Bardur Arantsson <ba...@im...> <ba...@sc...>
- Consider the universe, or, more precisely, algebra...
  Bill Gosper
From: Richard J. <ri...@an...> - 2004-05-03 12:10:59
|
A good reason to go with compatibility is that it encourages people to write different parts of an app in different languages. In this instance you could have a Python bit (lightweight scripting), and an OCaml bit (heavy-lifting), both reading the same configuration file format.

Rich.

--
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
http://www.winwinsales.co.uk/ - CRM improvement consultancy
From: Nicolas C. <war...@fr...> - 2004-05-02 12:03:44
|
> > The average user, either knows XML and will be able to edit by-hand or will not even try to edit and modify your config file, unless you provide him a nice GUI.
>
> Nonsense. To be able to edit *YOUR* XML the user must know at least two very non-obvious things:
>
> 1) Proper nesting rules for *your* DTD. (Which of course varies from app to app, so you can't hide behind the "everybody knows XML" bit). That means that they need to know how to read DTDs, or that you have to provide a layman's explanation of the format -- more work for you (for no good reason since .ini/yaml is much easier for both you *and* your users).
>
> 2) Allowable attribute values (which cannot usually be specified in the DTD anyway -- so the DTD model fails here)
>
> A .ini is trivially learned by comparison. Even if you sat a complete beginner down in front of a computer, they would be able to change the settings (given a text editor :)).

I don't see the point.
The DTD is there to enforce the structure of your XML, so you don't have to check at runtime whether everything is well defined (the structure of your XML is proved after parsing). The DTD is a tool for the application writer, and it can also be seen as a guideline for the user, although I'm not expecting users to look at the DTD to see how they should write your XML.

You also need to specify the structure of your INI at runtime, so what's the difference for the user between:

    [CONFIG]
    maxlines = 10

and

    <config maxlines="10"/>

?
Will it really be impossible for him to "learn XML" and replace the value 10 with whatever he wants?
I agree that XML is a bit more verbose, but there is no fundamental difference between XML and INI, except the syntax and the fact that XML allows for recursive data (when needed) and is widely supported by a lot of - actually all? - programming languages.

Regards,
Nicolas Cannasse
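To make the comparison above concrete, here is a minimal sketch in Python 3. It assumes the standard `configparser` module (the Python 3 spelling of the ConfigParser discussed in this thread) and uses `xml.etree.ElementTree` as a stand-in for whatever XML parser an application would pick; the helper names `maxlines_from_ini` and `maxlines_from_xml` are made up for illustration. In both cases the parser only hands back the string "10", and the application has to convert and check it.

```python
import configparser
import xml.etree.ElementTree as ET

INI_TEXT = """
[CONFIG]
maxlines = 10
"""

XML_TEXT = '<config maxlines="10"/>'


def maxlines_from_ini(text):
    """Read maxlines from the INI form (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Values are stored as strings; getint() converts and raises
    # ValueError if the user typed something that is not an integer.
    return parser.getint("CONFIG", "maxlines")


def maxlines_from_xml(text):
    """Read maxlines from the XML form (hypothetical helper)."""
    root = ET.fromstring(text)
    # Attribute values are strings as well, so the application converts them.
    return int(root.attrib["maxlines"])


print(maxlines_from_ini(INI_TEXT), maxlines_from_xml(XML_TEXT))
```

Either way the amount of application code is about the same for a single flat setting; the two formats only start to differ once nesting or many settings are involved.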
From: Bardur A. <oca...@sc...> - 2004-05-02 13:13:07
|
On Sun, May 02, 2004 at 02:01:34PM +0200, Nicolas Cannasse wrote:
> > > The average user, either knows XML and will be able to edit by-hand or will not even try to edit and modify your config file, unless you provide him a nice GUI.
> >
> > Nonsense. To be able to edit *YOUR* XML the user must know at least two very non-obvious things:
> >
> > 1) Proper nesting rules for *your* DTD. (Which of course varies from app to app, so you can't hide behind the "everybody knows XML" bit). That means that they need to know how to read DTDs, or that you have to provide a layman's explanation of the format -- more work for you (for no good reason since .ini/yaml is much easier for both you *and* your users).
> >
> > 2) Allowable attribute values (which cannot usually be specified in the DTD anyway -- so the DTD model fails here)
> >
> > A .ini is trivially learned by comparison. Even if you sat a complete beginner down in front of a computer, they would be able to change the settings (given a text editor :)).
>
> I don't see the point.
> The DTD is there to enforce the structure of your XML, so you don't have to check at runtime whether everything is well defined (the structure of your XML is proved after parsing).

Structure isn't everything. In fact structure is probably the least of your worries. You still have to check that all the _values_ (of which the DTD effectively says *nothing*) are semantically sound. There are also valid structural constraints which cannot be expressed in DTDs, so you have to check those yourself anyway.

> The DTD is a tool for the application writer, and it can also be seen as a guideline for the user, although I'm not expecting users to look at the DTD to see how they should write your XML.

But then, how would they know?

> You also need to specify the structure of your INI at runtime, so what's the difference for the user between:
>
>     [CONFIG]
>     maxlines = 10
>
> and
>
>     <config maxlines="10"/>
>
> ?

Let's add another setting, shall we?

    [CONFIG]
    maxlines = 10
    maxpants = 5
    maxtomatos = 1

So now your XML becomes:

    <config maxlines="10"/>
    <config maxpants="5"/>
    <config maxtomatos="1"/>

Notice the repetition? "Ah, but we can do it differently", you say? Well,

    <config>
      <maxlines value="10"/>
      <maxpants value="5"/>
      <maxtomatos value="1"/>
    </config>

might look slightly better, but we haven't really gained anything. It's in fact just as verbose, but now the parser (that is, the configuration parser) has to understand the hierarchical structure. (You might argue that adding code to understand the hierarchy is trivial, but the fact remains that you need to add that code...)

And also, the user cannot actually tell that "maxlines" and "maxpants" don't take string values... You might argue that this is obvious with those names, but what if we had a setting called "servers"? Is that a list of names or is it a number? This is neatly encoded into the .ini file format in a very obvious and above all else user-friendly way...

(Well, technically you can supply int values for settings which might also take string values: servers=10 might also mean that you were in fact specifying that you have a server named "10". But that is again somewhat beside the point.)

> Will it really be impossible for him to "learn XML" and replace the value 10 with whatever he wants?

Let's look at a more reasonable example, shall we...?

I distinctly remember installing a jabber server some time ago. It doesn't actually have a hierarchical configuration, but they decided to go for an XML-based config file format anyway (hey, they'd already implemented an XML parser for the jabber protocol, so why not?). That file was about 16K in size, and the actual "size" of the settings you could configure was about 1-2K. I remember it precisely because, reading through the configuration file, it became painfully obvious just how bad an idea having it in XML was.

> I agree that XML is a bit more verbose, but there is no fundamental difference between XML and INI, except the syntax and the fact that XML allows for recursive data (when needed) and is widely supported by a lot of - actually all? - programming languages.

You're underestimating the user-editability aspect.

Besides, .ini does actually support "hierarchical" configuration data -- at least the way it's used in most configurations: You see, most configuration data is not actually arbitrarily deeply nested (which is basically what XML can support that .INI supposedly can't). The .INI format can support any structure that is not arbitrarily deeply nested (that's what named sections are for!). You can even support arbitrarily deep section hierarchies very simply:

    [config]
    my_value = 5
    next = config2

    [config2]
    my_value = 3
    next = config3

    [...]

Before you say "this is ugly!", yes, I agree to some extent... but it is actually _still_ less verbose than XML would be, and almost nobody would actually _need_ to do such ugly things...

I have used something similar once, though: For a particular program I used a configuration file like this (names have been changed to protect the innocent :)):

    [Choices]
    choices=choice1,choice2,choice3

    [choice1]
    description=Description1
    id=ID1

    [choice2]
    description=Description2
    id=ID2

    [choice3]
    description=Description3
    id=ID3

(That's pretty self-explanatory, isn't it?)

Parsing this in anything XML-like would take oodles of code (at least using PXP), but as it turned out I could parse it and put it into an internally useful form in about 3-4 lines of Python. (This doesn't have anything to do with the nature of the Python language itself, it's simply because ConfigParser is genius!)

I probably won't say more on the matter of XML vs. anything else, but would like to register my vote for including ConfigParser in ExtLib. I would be a very happy camper indeed if that were to happen. :)

--
Bardur Arantsson <ba...@im...> <ba...@sc...>
- I don't recommend illegal drugs, guns, insanity and violence to everyone, but they've always worked for me.
  Dr. Hunter S. Thompson
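The original few lines of Python are not shown in the message, so the following is only a guess at what such code might look like, written against Python 3's `configparser` (the renamed ConfigParser). The function name `load_choices` and the dictionary layout it returns are assumptions made for this sketch.

```python
import configparser

CONFIG_TEXT = """
[Choices]
choices=choice1,choice2,choice3

[choice1]
description=Description1
id=ID1

[choice2]
description=Description2
id=ID2

[choice3]
description=Description3
id=ID3
"""


def load_choices(text):
    """Return {section name: {option: value}} for every listed choice."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # The [Choices] section names the other sections, comma separated.
    names = [n.strip() for n in parser.get("Choices", "choices").split(",")]
    # Each named section becomes a plain dict of its options.
    return {name: dict(parser[name]) for name in names}


choices = load_choices(CONFIG_TEXT)
print(choices["choice2"]["description"])
```

The indirection through the `choices` option is what gives the file its one level of "hierarchy": the list decides which sections are read, and unlisted sections are simply ignored.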
From: <Ala...@en...> - 2004-05-02 13:19:41
|
On Sun, 2 May 2004, Bardur Arantsson wrote:

> Granted, but you're overlooking its usefulness in situations where you don't NEED any more features than .INI has. XML is utter overkill for something so simple as configuration data which is not hierarchical in nature.

I can think of many situations where the configuration data is hierarchical (even if not recursive). Have a look at a config file for Apache with several virtual hosts, for instance.

One of the benefits of using XML is the clean treatment of encodings.

> - XML is incredibly verbose; take a look at a typical XML config versus e.g. the sshd config file. Then come back and suggest XML as a configuration file format. It's very hard to separate the wheat from the chaff in your typical XML file. This makes it hard to edit.

You picked one of the simplest examples.

Mmmh, so, what is the exact syntax for comments in these files? I have to look at the man page to know if it is OK to use # in the middle of a line. If the config file contained text (which is not the case for sshd AFAIK), what would be the encoding?

An XML syntax could be:

    <sshd_config
      port="22"
      hostkey="/etc/ssh_host_key"
      ...
    />

Is it really that complex?

A more realistic design would organize the config file by section.

Now look at the config files for MTAs, web servers, ml-donkey, etc... They all require much more structure than a simple key-data list.

I agree that XML has many concrete (syntactic) problems, but it is standard and the richer data model is sometimes useful.

> A .ini is trivially learned by comparison.

So is an XML config file. What's your point?

-- Alain
From: Bardur A. <oca...@sc...> - 2004-05-02 13:34:37
|
On Sun, May 02, 2004 at 03:19:27PM +0200, Ala...@en... wrote:
> On Sun, 2 May 2004, Bardur Arantsson wrote:
>
> > Granted, but you're overlooking its usefulness in situations where you don't NEED any more features than .INI has. XML is utter overkill for something so simple as configuration data which is not hierarchical in nature.
>
> I can think of many situations where the configuration data is hierarchical (even if not recursive). Have a look at a config file for Apache with several virtual hosts, for instance.

I was just looking through my myriad of files in /etc, and you know what...

I could only find *ONE* configuration file that used a hierarchical structure. That must mean something...

> One of the benefits of using XML is the clean treatment of encodings.
>
> > - XML is incredibly verbose; take a look at a typical XML config versus e.g. the sshd config file. Then come back and suggest XML as a configuration file format. It's very hard to separate the wheat from the chaff in your typical XML file. This makes it hard to edit.
>
> You picked one of the simplest examples.

Well, yes, but that was just to make a point. :)

> Mmmh, so, what is the exact syntax for comments in these files?

AFAICR, # and ; will both work (if you're referring to INI files, that is). That may not be obvious, but neither is <!-- BLA -->

> I have to look at the man page to know if it is OK to use # in the middle of a line. If the config file contained text (which is not the case for sshd AFAIK), what would be the encoding?
>
> An XML syntax could be:
>
>     <sshd_config
>       port="22"
>       hostkey="/etc/ssh_host_key"
>       ...
>     />
>
> Is it really that complex?

It's _unneeded_ complexity that I have a problem with. Not complexity per se.

If you changed the sshd_config like you suggested, then you would now need to list all possible settings twice: once in the DTD and once in the actual code which parses the file. That's just bad software engineering practice...

> A more realistic design would organize the config file by section.

That is what .INI does.

> Now look at the config files for MTAs, web servers, ml-donkey, etc... They all require much more structure than a simple key-data list.

Well, I can only speak of what I know, so...

MTAs: Exim uses its own format and thank $DEITY for that. It's simple and to the point -- in fact I'd go so far as to call it beautiful (at least relative to all other mail servers I've tried). Sendmail doesn't use XML and I don't think it's hierarchical. Neither does Postfix (I don't think that it's hierarchical either, but I could be wrong).

Web server: Apache doesn't use XML. Yes, it's hierarchical, but their parser can actually distinguish between when something is supposed to be an int and not a string (without having to do a second pass as it were). The reason that it's hierarchical is NOT virtual hosts -- they cannot be nested -- but the recursive nature of permissions on DIRECTORIES. So there.

ml-donkey: No idea about this one...

> I agree that XML has many concrete (syntactic) problems, but it is standard and the richer data model is sometimes useful.
>
> > A .ini is trivially learned by comparison.
>
> So is an XML config file. What's your point?

An XML config file is learned trivially by comparison to learning an XML config file? That makes no sense to me...

Point = it's easier for a person unfamiliar with XML or INI to learn to modify an INI file. That's all....

--
Bardur Arantsson <ba...@im...> <ba...@sc...>
- It might look like I'm standing motionless, but I'm actively waiting for my problems to go away.
  Scott Adams
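On the comment-syntax question, one data point (and only one: INI dialects differ, and this says nothing about sshd_config) is Python 3's `configparser`, the current name of the ConfigParser module discussed here. By default it treats lines starting with `#` or `;` as comments but does not recognise comments in the middle of a line; that has to be enabled with `inline_comment_prefixes`. A small sketch, with a made-up `[server]` section:

```python
import configparser

TEXT = """
# a full-line comment
; another full-line comment
[server]
port = 22  # is this part of the value?
"""

# Default settings: only full-line comments are recognised.
default_parser = configparser.ConfigParser()
default_parser.read_string(TEXT)

# Opt in to comments that start in the middle of a line.
inline_parser = configparser.ConfigParser(inline_comment_prefixes=("#", ";"))
inline_parser.read_string(TEXT)

print(repr(default_parser.get("server", "port")))
print(repr(inline_parser.get("server", "port")))
```

With the defaults the trailing text stays in the value, so a later `getint` would fail; with inline prefixes enabled the value is just "22". This is the kind of convention a user does have to look up.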
From: Martin J. <mar...@em...> - 2004-05-02 15:16:48
|
Hello,

I think it would be nice to have access to manipulation of very simple config files:

    maxlines = 100 # this is an int
    server = "machin.toto.org" # this is a string

For more complex situations when hierarchical data must be passed, well, I am not convinced that any predefined syntax could be satisfying. And if you are writing a complex application, you have time to choose or design or redesign an appropriate format.

For information, in several applications I already used a format like this one:

    { (* we start an anonymous record *)
      a = 1; (* this is a labeled field *)
      b = 2;
      c = "hello";
      (* field "list" is bound to a record *)
      list = {
        1; (* has type int *)
        2;
        3;
        "gouzou"; (* has type string *)
        2.340; (* has type float *)
        true; (* has type bool *)
        'x'; (* has type char *)
      } /list; (* the closing tag is optional but must be correct *)
      "coucou"; (* this is an anonymous field *)
      33;
      some_string = file_contents "/etc/toto.conf"; (* this feature can be disabled *)
      some_subtree = include "/etc/bidule.conf"; (* can be disabled too *)
    }

For simple (key,data) bindings, you need to write something like this:

    { maxlines = 100;
      server = "machin.toto.org" }

You must write annoying semicolons and one pair of curly brackets, but the leaves are typed and it is easy to express trees and lists if needed.

Martin
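For the very simple `key = value # comment` files mentioned at the top of this message, a reader is small enough to sketch. The version below is an illustration in Python, not the format's actual implementation: the name `parse_simple`, the regular expression, and the decision to accept only integers and double-quoted strings (decoded with Python's own literal rules via `ast.literal_eval`) are all assumptions.

```python
import ast
import re

# key = value, where value is a double-quoted string or a bare token,
# optionally followed by a "#" comment.
LINE = re.compile(r'^\s*(\w+)\s*=\s*("(?:[^"\\]|\\.)*"|[^#\s]+)\s*(?:#.*)?$')


def parse_simple(text):
    """Parse typed key/value lines into a dict (hypothetical helper)."""
    settings = {}
    for raw in text.splitlines():
        # Skip blank lines and full-line comments.
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue
        match = LINE.match(raw)
        if match is None:
            raise ValueError(f"cannot parse line: {raw!r}")
        key, literal = match.groups()
        # literal_eval turns 100 into an int and "..." into a str,
        # and raises on anything that is not a plain literal.
        settings[key] = ast.literal_eval(literal)
    return settings


EXAMPLE = '''
maxlines = 100 # this is an int
server = "machin.toto.org" # this is a string
'''

print(parse_simple(EXAMPLE))
```

Because strings must be quoted, a `#` inside a string is unambiguous, which is the main thing the explicit typing buys in this small format.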
From: <Ala...@en...> - 2004-05-02 21:53:37
|
On Sun, 2 May 2004, Bardur Arantsson wrote:

> I was just looking through my myriad of files in /etc, and you know what...
>
> I could only find *ONE* configuration file that used a hierarchical structure. That must mean something...

We should define what we call hierarchical. For me, a .ini file has a flat but still hierarchical organization. In the exim config file, you have sections (ended by the keyword "end"), and some items in sections are compound items, with several key-value bindings. This is already a two-level hierarchy. Similarly, the XF86Config file has sections and sub-sections.

> > An XML syntax could be:
> >
> >     <sshd_config
> >       port="22"
> >       hostkey="/etc/ssh_host_key"
> >       ...
> >     />
> >
> > Is it really that complex?
>
> It's _unneeded_ complexity that I have a problem with. Not complexity per se.

So: does the example above have unneeded complexity?

> If you changed the sshd_config like you suggested, then you would now need to list all possible settings twice: once in the DTD and once in the actual code which parses the file. That's just bad software engineering practice...

I'm not necessarily advocating the use of DTD. With .ini, you won't have any formal description of the validity constraints either.

> MTAs: Exim uses its own format and thank $DEITY for that. It's simple and to the point -- in fact I'd go so far as to call it beautiful (at least relative to all other mail servers I've tried).

In exim.conf, sections are just closed by the "end" keyword. They have no name. So you can find:

    end
    end
    end

Go figure out in which section you are.

From my exim.conf:
# This file is divided into several parts, all but the last of which are
# terminated by a line containing the word "end". The parts must appear
# in the correct order, and all must be present (even if some of them are
# in fact empty). Blank lines, and lines starting with # are ignored.

I wouldn't call this beautiful.

> Web server: Apache doesn't use XML. Yes, it's hierarchical, but their parser can actually distinguish between when something is supposed to be an int and not a string (without having to do a second pass as it were). The reason that it's hierarchical is NOT virtual hosts -- they cannot be nested -- but the recursive nature of permissions on DIRECTORIES. So there.

<IfModule> directives can also be nested. And as I said, there can be hierarchy without recursion (and this is the case).

> ml-donkey: No idea about this one...

ml-donkey uses the Options module which was also released in Cameleon. The API may be worth a look:
http://pauillac.inria.fr/~guesdon/Tools/cameleon/manual/Options.html

What about using the syntax of OCaml values for config files:

    {
      port = 22;
      protocol = [2; 1];
      host_keys = [ "/etc/ssh/ssh_host_key";
                    "/etc/ssh/ssh_host_rsa_key" ];
      keep_alive = false
    }

?

-- Alain
From: Bardur A. <oca...@sc...> - 2004-05-02 23:16:47
|
On Sun, May 02, 2004 at 11:53:16PM +0200, Ala...@en... wrote:
> On Sun, 2 May 2004, Bardur Arantsson wrote:
>
> > I was just looking through my myriad of files in /etc, and you know what...
> >
> > I could only find *ONE* configuration file that used a hierarchical structure. That must mean something...
>
> We should define what we call hierarchical. For me, a .ini file has a flat but still hierarchical organization. In the exim config file, you have sections (ended by the keyword "end"), and some items in sections are compound items, with several key-value bindings. This is already a two-level hierarchy. Similarly, the XF86Config file has sections and sub-sections.

You're right; it's useless to debate something without defining things properly...

But just for the record: It was a poor choice of words on my part. I actually meant it in the sense of "structurally recursive", i.e. something which can be nested arbitrarily deeply and still make sense -- for anything else, I think (blatant opinion follows! :)) that .ini is superior to XML.

Btw, good call on XF86Config (xorg.conf in my case); the single file I found was something else (can't remember what it was now). Even so, that file is not structurally recursive, it just has sections and subsections; AFAIK it doesn't have subsubsections (or any other structure which can be nested arbitrarily deeply).

> > > An XML syntax could be:
> > >
> > >     <sshd_config
> > >       port="22"
> > >       hostkey="/etc/ssh_host_key"
> > >       ...
> > >     />
> > >
> > > Is it really that complex?
> >
> > It's _unneeded_ complexity that I have a problem with. Not complexity per se.
>
> So: does the example above have unneeded complexity?

Yes, there is markup which is not needed -- if the structure is flat, there is no reason to have an "sshd_config" element. It's implied.

> > If you changed the sshd_config like you suggested, then you would now need to list all possible settings twice: once in the DTD and once in the actual code which parses the file. That's just bad software engineering practice...
>
> I'm not necessarily advocating the use of DTD. With .ini, you won't have any formal description of the validity constraints either.

True, but that is beside the point: With .ini you are at least not artificially separating the validation into two parts: structural validation and semantic validation.

It might sound like a good idea to separate them, but in practice structural soundness and semantic soundness are so tightly linked that it doesn't make sense to separate them. The trouble is that people assume that structural soundness = semantic soundness and so assume that everything will suddenly and magically become easier when using XML.

This is also why "interchanging" XML is not really as easy as XML proponents like to make it sound. Sure, you can tell if it's syntactically valid, but you have no way of knowing whether it's semantically valid.

> > MTAs: Exim uses its own format and thank $DEITY for that. It's simple and to the point -- in fact I'd go so far as to call it beautiful (at least relative to all other mail servers I've tried).
>
> In exim.conf, sections are just closed by the "end" keyword. They have no name. So you can find:
>
>     end
>     end
>     end
>
> Go figure out in which section you are.

The same applies to any XML document where an element can be nested...

    ...
    </ol>
    </ol>
    </ol> <--- which list is this??

When I want to find a section I usually also start by searching for the section name -- my guess is that that is what any reasonable person would do.

> From my exim.conf:
> # This file is divided into several parts, all but the last of which are
> # terminated by a line containing the word "end". The parts must appear
> # in the correct order, and all must be present (even if some of them are
> # in fact empty). Blank lines, and lines starting with # are ignored.
>
> I wouldn't call this beautiful.

Beauty is of course entirely subjective, but I actually like it... Then again, I also like LISP. :)

> > Web server: Apache doesn't use XML. Yes, it's hierarchical, but their parser can actually distinguish between when something is supposed to be an int and not a string (without having to do a second pass as it were). The reason that it's hierarchical is NOT virtual hosts -- they cannot be nested -- but the recursive nature of permissions on DIRECTORIES. So there.
>
> <IfModule> directives can also be nested. And as I said, there can be hierarchy without recursion (and this is the case).

Yes, again I meant "structurally recursive". Sorry for not being clear about that.

I'm actually not sure that <IfModule> is required at all. It might have been better if the configuration parser just emitted a warning on unknown settings/sections and otherwise ignored them.

As an aside, IMNSHO one of the best examples of how to implement this sort of format extensibility is the chunk naming convention of PNG -- of course it would be slightly different syntactically, but it would do away with having to use the IfModule construct...

> > ml-donkey: No idea about this one...
>
> ml-donkey uses the Options module which was also released in Cameleon. The API may be worth a look:
> http://pauillac.inria.fr/~guesdon/Tools/cameleon/manual/Options.html
>
> What about using the syntax of OCaml values for config files:
>
>     {
>       port = 22;
>       protocol = [2; 1];
>       host_keys = [ "/etc/ssh/ssh_host_key";
>                     "/etc/ssh/ssh_host_rsa_key" ];
>       keep_alive = false
>     }

That actually looks reasonably sane, but I sort of feel that you're burdening the user needlessly by having them specify the types of the various settings...

Remember: Just because something is a string does not necessarily mean that it is a file name or that said file exists, so the library's user (i.e. the app programmer) always has to double check anyway. If the program checks anyway, why should the user have to specify what type of data they are entering?

Of course, the user has to know that some particular configuration setting takes string values (simply because they have to know the semantics of the setting they're changing), but they shouldn't have to *specify* the types redundantly.

Such "type specification" may be necessary for disambiguating the input, of course, but that cannot be avoided in some cases (just consider a filename containing a "," in the following example).

Example:

    [server]
    port = 22
    protocol = 2, 1
    host_keys = /etc/ssh/..., /etc/ssh/...
    keep_alive = false

    [client]
    protocol = 2
    ...

This format would require less of the user and would still be perfectly computer-parsable. In fact, I might be tempted to implement something like this if I weren't so busy right now...

Btw, I'm not saying that .ini is the be-all and end-all of configuration file formats, but I *am* saying that it's better than XML for almost any *practical* (configuration-format!) situation you can think of.

If there's one thing I've learned through all those years at university it's that you should always, always, always optimize for the common case (i.e. a simple non-recursive configuration format), but still make the uncommon cases _possible_ (which they actually are with ini files).

I still don't think we've heard any convincing arguments *against* .ini.

--
Bardur Arantsson <ba...@im...> <ba...@sc...>
A computer lets you make more mistakes faster than any other invention, with the possible exceptions of handguns and Tequila.
  Mitch Ratcliffe
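The untyped `[server]` example can be read with the types supplied by the application rather than by the user. The sketch below uses Python 3's `configparser`; the function name `load_server` and the shape of the returned dictionary are assumptions, and the elided host key paths are filled in with the two paths from Alain's OCaml-syntax example purely so the code can run.

```python
import configparser

TEXT = """
[server]
port = 22
protocol = 2, 1
host_keys = /etc/ssh/ssh_host_key, /etc/ssh/ssh_host_rsa_key
keep_alive = false
"""


def load_server(text):
    """Read the [server] section, converting each value in the application."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    server = parser["server"]
    return {
        # The program, not the file, knows that port is an integer.
        "port": server.getint("port"),
        # Comma-separated list of integers; int() tolerates the blanks.
        "protocol": [int(p) for p in server["protocol"].split(",")],
        # Naive split: breaks if a file name itself contains a comma.
        "host_keys": [k.strip() for k in server["host_keys"].split(",")],
        "keep_alive": server.getboolean("keep_alive"),
    }


print(load_server(TEXT))
```

The user writes no quotes or brackets, at the cost of the comma ambiguity noted in the comment; each conversion raises `ValueError` on malformed input, which is where the application's own validation would hook in.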
From: <Ala...@en...> - 2004-05-02 23:59:12
|
On Mon, 3 May 2004, Bardur Arantsson wrote:

> But just for the record: It was a poor choice of words on my part. I actually meant it in the sense of "structurally recursive", i.e. something which can be nested arbitrarily deeply and still make sense -- for anything else, I think (blatant opinion follows! :)) that .ini is superior to XML.

So what about hierarchical but not recursive structure? I mean, where you just want different levels of grouping (like XF86Config).

> Yes, there is markup which is not needed -- if the structure is flat, there is no reason to have an "sshd_config" element. It's implied.

Putting in more explicit information may be a good thing in some situations. For instance, the user could be allowed to put the configuration for several (related) programs into a single file or into many files, etc...

> True, but that is beside the point: With .ini you are at least not artificially separating the validation into two parts: structural validation and semantic validation.

If you don't use DTD (nor XML Schema, Relax-NG, etc...), why would you do that with XML more than with .ini?

> > In exim.conf, sections are just closed by the "end" keyword. They have no name. So you can find:
> >
> >     end
> >     end
> >     end
> >
> > Go figure out in which section you are.
>
> The same applies to any XML document where an element can be nested...
>
>     ...
>     </ol>
>     </ol>
>     </ol> <--- which list is this??

You have an explicit opening element in XML, and a decent editor mode will easily show it. Btw, are there emacs/vi modes for .ini?

> When I want to find a section I usually also start by searching for the section name -- my guess is that that is what any reasonable person would do.

But sections have no names (nor opening markup) in exim.conf.

> Such "type specification" may be necessary for disambiguating the input, of course, but that cannot be avoided in some cases (just consider a filename containing a "," in the following example).
>
>     host_keys = /etc/ssh/..., /etc/ssh/...

So will this be parsed as one or two file names? Does the file name include the leading whitespace after the = sign? (These are rhetorical questions, of course.)

Forcing explicit delimiters and avoiding ad hoc and context-sensitive parsing would provide a more robust syntax. If you ever want to write an emacs/vi mode, or a custom (generic) editor for config files, you'd better have uniform parsing rules.

> I still don't think we've heard any convincing arguments *against* .ini.

- Cannot deal with hierarchical grouping, even if not recursive
- No support for Unicode
- No support in existing tools (AFAIK), like editors or editor modes
- No widely accepted formal syntactic conventions (?)

Ok, these might not be convincing for you ;-)

Alain
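The comma question can be shown in a few lines. This is a sketch in Python under stated assumptions: the file names are invented, `split_naive` and `split_quoted` are hypothetical helpers, and borrowing CSV-style double quotes (via the standard `csv` module) is just one possible explicit-delimiter convention, not something any INI dialect mandates.

```python
import csv


def split_naive(value):
    """Split on every comma, as an ad hoc list convention would."""
    return [item.strip() for item in value.split(",")]


def split_quoted(value):
    """Split on commas, but honour double quotes around an item."""
    # skipinitialspace drops the blank that follows each separator.
    return next(csv.reader([value], skipinitialspace=True))


# A file name that itself contains a comma (invented for the example).
ambiguous = "/etc/ssh/key,backup, /etc/ssh/other"
quoted = '"/etc/ssh/key,backup", /etc/ssh/other'

print(split_naive(ambiguous))
print(split_quoted(quoted))
```

The naive split silently yields three names where two were meant; with explicit quoting the two-name reading is recoverable, but the user now has to know the quoting rule, which is the trade-off being argued in this exchange.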