From: skaller <sk...@us...> - 2004-12-17 13:07:45
On Fri, 2004-12-17 at 22:36, Bardur Arantsson wrote:

> > Doing things 'the right way' can't be overkill can it?
> > I would expect this to be lightning fast and it should
> > make it easy to generalise/extend ..?
>
> I'm not sure there will ever be any need to
> generalise/extend, but anyway...

Consider:

(a) C has certain rules for resolving #include filenames.
    Recall even "kjkjh" and <kjhgkjg> are distinct ..

(b) Felix (and Interscript) also have rules for resolving
    filenames. Interscript's rules are in fact quite interesting:
    names of include files in Interscript are *required* to be
    Unix relative filenames. From the command line, a native
    filename prefix is taken. The two are spliced together, after
    converting the Interscript name to a native one. [This ensures
    everything 'in document' is as OS independent as possible.]

(c) There are several conventions for PATH names. The Unix one
    (separator ':') is one, but TeX uses kpathsea ..

(d) I have used an archaic system which is much better than any of
    the above .. a TI OS which has no subdirectories at all.
    Instead it has something much better: an environment plus a
    structured filename convention. Filename components are
    replaced from the environment, subsuming the current-directory
    idea completely.

(e) Hmm, what about URL/URI things .. :)

In any case the idea of using a parser for filenames isn't overkill
IMHO .. on the contrary, the problem is more likely to be that mere
LALR(1) parsing and Ocamllex lexing simply isn't good enough (some
OSes use UCS-2 filenames .. Solaris and Win32 for example ..).
Ocamllex, for example, can't even translate UTF-8 (I tried once; it
blows the lexer generator's brains out). The real advantage of a
lexer/parser combination seems to be that the specification is
heavily declarative.

> To my mind, using full-blown parsers is overkill for
> splitting UNIX paths into their constituent parts,

But we're not restricted to Unix ..

> Another concern, which is more related to the interface, is
> that the module seems to raise exceptions in situations
> one wouldn't normally expect. As an example:
>
>   FilePath.check_extension p ext
>
> raises an exception if the filename doesn't have an
> extension. I can't tell whether this is a part of the
> interface

Agree. I think exceptions should be reserved for unrecoverable
errors if possible, but in any case it should be documented.

> Apart from that, I feel that simple 'shortcut' path
> queries like is_dir, is_link, etc. should be added to the
> FilePath module. I realize that this removes the
> separation of purely abstract paths and concrete files, but
> it's just too convenient to pass up IMO.

I think the separation is an *essential* feature. It's there by
design and for a good reason: you can have Unix, MacOS, and Win32
modules all available at once. These modules must not be permitted
to touch the file system. However, short versions in the FileUtil
module, or a third module (e.g. FileQuery), may make sense.
(Especially as 'fstat' and friends are fairly OS-specific animals.)
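Roughly the kind of split I have in mind, sketched below with
hypothetical signatures (not the existing FilePath/FileUtil
interface), keeping pure path manipulation separate from anything
that touches the filesystem:

  (* Purely abstract path manipulation: no filesystem access, so the
     Unix, Win32 and MacOS flavours can all be linked in at once. *)
  module type PATH = sig
    type t
    val of_string : string -> t
    val to_string : t -> string
    val concat : t -> t -> t
    (* Return None instead of raising when there is no extension. *)
    val extension : t -> string option
    val check_extension : t -> string -> bool
  end

  (* Concrete queries live elsewhere; only this module may touch the
     filesystem, and it can be as OS-specific as it needs to be. *)
  module type QUERY = sig
    type path
    val is_dir : path -> bool
    val is_link : path -> bool
    val exists : path -> bool
  end

That keeps the FilePath-style modules purely abstract, while a
FileQuery-style module (or extra functions in FileUtil) provides
is_dir/is_link over whichever path flavour you picked.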
> All of the above is stuff that's fixable, so I guess the
> best idea would be to just decide on an answer to the
> question
>
> Do we want a path/file query/manipulation module in ExtLib?
>
> My answer would definitely be 'yes'

Mine too. Filenames are needed even in Pervasives ..

> Maybe I'm just stupid, but I don't see why a test harness
> would require lots of work...?

Because it has to

(a) collate all the tests
(b) run the tests -- terminating rogues
(c) collate the results
(d) standardise a way to actually report results

Point (d) is extremely difficult.

> Of course, writing individual test cases for all the
> modules would be a lot of work, but this is work that can
> be done incrementally.

Yes, I don't think writing the tests is the issue. We can start
with just half a dozen, and make sure that for every bug a
regression test is generated.

> Am I missing something?

Yeah -- designing a test system is probably harder than designing a
library or an average application. This is especially the case for
Ocaml, I suspect, since it doesn't support dynamic loading and
unloading. That seems to mean each test must be a separate process.

In addition, some tests are dangerous, especially ones that mess
with the filesystem or network -- usually you'd run your tests in a
suitably restricted environment as a low-privilege user .. which
makes the test harness OS-specific ..

In addition, we will need community support: something like a web
page for submitting tests, so that they're easy to install and meet
requirements such as having a description, expected output, etc ..

I personally think 'unit' testing is a bit silly. It rarely finds
bugs because it can't cover enough cases, can't handle integration,
etc. I prefer a sloppier concept of just collecting whatever test
code you can and running it, just to get some confidence you didn't
completely mess something up whilst committing a minor change to
CVS. (A rough sketch of what I mean is appended below my sig.)

--
John Skaller, mailto:sk...@us...
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net
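[Appendix: a minimal sketch of the sort of 'sloppy' runner referred
to above. It is purely illustrative: it assumes each test is just a
named unit -> bool function, counts an escaping exception as a
failure, and skips all the hard parts (process isolation, killing
rogues, standardised reporting).]

  (* Deliberately naive test collector: each test is a name plus a
     thunk returning true on success. *)
  let tests : (string * (unit -> bool)) list ref = ref []

  let register name f = tests := (name, f) :: !tests

  let run_all () =
    let failed = ref 0 in
    List.iter
      (fun (name, f) ->
         let ok = (try f () with _ -> false) in
         if not ok then begin
           incr failed;
           print_endline ("FAIL: " ^ name)
         end)
      (List.rev !tests);
    Printf.printf "%d test(s) failed\n" !failed

  (* Example: a regression test registered next to the bug it covers. *)
  let () =
    register "check_suffix_ml"
      (fun () -> Filename.check_suffix "foo.ml" ".ml");
    run_all ()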