From: Andrew K. <ak...@mi...> - 2001-09-19 15:02:16
|
For comparison purposes I present below the compilation model we use in MLj and SML.NET. As new versions of both these compilers are currently unavailable I guess that not many people are familiar with our model. Personally I dislike even having to list explicitly the files making up a project, though I recognise that for large projects the module-level namespace management and library support etc that CM provides are very valuable.

Our model consists of the following:

(A) Agree on some way to map top-level SML Module identifiers (for structures, functors and signatures) to full file names identifying the file that contains the single binding for that Module entity (structure, functor or signature).

(B) Supply a set of "root" SML structure identifiers.

(C) Let the SML compiler figure out the rest, i.e. do sufficient parsing, mapping of top-level identifiers to filenames, and dependency analysis to figure out which files to compile and in what order.

For simple projects whose files all reside in a single directory and whose names conform to standard conventions, all the user needs to do is type "make Main" (or whatever) and the compiler does the rest.

For more complex projects the mapping (A) breaks down into

(1) A PATH-like list of directories.
(2) A suffix convention (.sig for signatures, .sml for structures and functors).
(3) A means of overriding the convention to map particular identifiers to particular files.
(4) A further means of overriding the convention to supply file names that may contain multiple top-level module bindings (as CM allows).

We have successfully applied the scheme to the compiler itself, the most unpleasant aspect being the use of SML/NJ libraries whose files have an informal naming convention which could very nearly be formalized but which currently must be mapped explicitly by the programmer, e.g. IntRedBlackSet |-> int-redblack-set.sml.

- Andrew.
> -----Original Message-----
> From: Matthias Blume [mailto:bl...@re...]
> Sent: Wednesday, September 19, 2001 3:28 PM
> To: Daniel C. Wang
> Cc: Stephen Weeks; sml...@li...
> Subject: Re: [Sml-implementers] compilation management
>
> "Daniel C. Wang" wrote:
> >
> > At 05:01 PM 9/18/2001 -0700, Stephen Weeks wrote:
> > >{stuff deleted}
> > >Here is my main requirement for a standard compilation management
> > >system that specifies how to build applications:
> > >
> > >From a description of an application, it should be easy to extract
> > >an SML program consisting of all and only the modules that are
> > >needed for the application. (This extraction may require renaming
> > >modules.) In fact, that is one way to define the dynamic semantics
> > >of the compilation management system -- by translating the
> > >description of the application into an SML program.
> > >
> > >By "easy", I mean that the extraction should be possible without
> > >looking at any SML sources, only by looking at the application and
> > >library description files.
> >
> > Let's just keep things simple, and get everyone to adopt CM. I've
> > talked to Matthias in the past about having a "dump everything to one
> > big SML file" option for CM. It's not that hard to do. Making CM
> > independent of the underlying SML/NJ compiler will probably take some
> > more work, but much less work than trying to reinvent yet another
> > compilation system.
> >
> > This suggestion violates Stephen's "easy" requirement. However, I think
> > that by making things "easy" for the tool it must make things harder
> > for the user. I'm happy as things are now.. where I just list a bunch
> > of SML files in any old order in my sources.cm and type CM.make().
>
> Obviously, I have a natural bias in favor of CM. :-) But I
> actually agree with Steve (and others) to some degree.
>
> Getting everyone to adopt CM is a bit too ambitious, I think.
> More realistic is a simple exchange format that specifies
> precisely which files get compiled in the context of what
> bindings imported from where. I have ideas as to what this
> format could look like. As you might imagine, CM internally
> computes all this information anyway, so dumping it into an
> ASCII file will not be hard at all.
>
> It is my plan to design a low-level format that I believe all
> SML implementors could easily support without having to do
> fancy things such as dependency analysis etc. The format will
> also provide the necessary information for the transformation
> into pure SML that we described in our TOPLAS paper (where it
> was also used to define the dynamic semantics of CM).
>
> > Even for very large systems, I see no advantage to having the user
> > provide any more information than a list of files and perhaps an
> > export list of identifiers.
>
> Right. I see no reason to give up CM or its model in SML/NJ.
> But I can modify its implementation in such a way that a
> more explicit low-level interchange format can be produced
> (and even used internally) by it.
>
> > I am a CM bigot... so perhaps there are some strong concrete technical
> > reasons I'm ignoring... but unless there are I don't see any reason
> > not to just adopt CM*.
>
> I am all for adopting CM. But the current implementation
> will probably be hard to port because it "knows" an awful lot
> about SML/NJ. Plus, the bulk of the code is not concerned
> with simple dependency analysis anyway. (I can imagine
> ripping out the dependency analyzer, but that's on the order
> of less than 500 lines of code anyway.)
>
> Matthias
>
> _______________________________________________
> Sml-implementers mailing list Sml...@li...
> https://lists.sourceforge.net/lists/listinfo/sml-implementers
|
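Editorial sketch: the model Andrew describes at the top of this message (a search path, a suffix convention, per-identifier overrides, then dependency analysis from a set of roots) can be illustrated roughly as follows. This is not MLj or SML.NET code. The names `OVERRIDES`, `candidates`, `resolve` and `compile_order` are hypothetical, the dependency table is supplied by hand where a real compiler would obtain it by parsing, and applying overrides through the search path is an assumption.

```python
from graphlib import TopologicalSorter

# Hypothetical override table for item (3): module identifier -> file name.
OVERRIDES = {"IntRedBlackSet": "int-redblack-set.sml"}


def candidates(module_id, search_path, overrides=OVERRIDES):
    """List the files that could hold `module_id`, in lookup order."""
    if module_id in overrides:
        names = [overrides[module_id]]
    else:
        # Item (2): the identifier alone does not say whether it names a
        # signature or a structure/functor, so try both suffixes.
        names = [module_id + ".sig", module_id + ".sml"]
    # Item (1): search each directory of the PATH-like list in order.
    return [f"{d}/{n}" for d in search_path for n in names]


def resolve(module_id, search_path, existing_files):
    """Return the first candidate present in `existing_files`, else None."""
    for path in candidates(module_id, search_path):
        if path in existing_files:
            return path
    return None


def compile_order(roots, deps):
    """Step (C): modules reachable from `roots`, dependencies first."""
    needed, stack = set(), list(roots)
    while stack:
        module = stack.pop()
        if module not in needed:
            needed.add(module)
            stack.extend(deps.get(module, ()))
    # TopologicalSorter expects node -> set of predecessors (dependencies).
    graph = {m: set(deps.get(m, ())) for m in needed}
    return list(TopologicalSorter(graph).static_order())
```

With a root of `Main`, only modules reachable from `Main` appear in the result, which is the "all and only the modules that are needed" property Stephen asks for in the quoted text.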
From: Robert H. <Rob...@cs...> - 2001-09-19 21:28:21
|
Just to be sure we're on the same page, let me say that I would not want to standardize on a TOOL, but rather a LANGUAGE for expressing separate compilation. In particular, saying that CM could be made to generate one huge source file for those of us that don't implement CM is a non-starter.

An analogy: CM is like type inference; we want to specify the underlying explicitly typed syntax.

We will send a brief summary of our compilation mechanism to this list shortly.

Bob

-----Original Message-----
From: Matthias Blume [mailto:bl...@re...]
Sent: Wednesday, September 19, 2001 11:09 AM
To: Andrew Kennedy
Cc: Daniel C. Wang; Stephen Weeks; sml...@li...
Subject: Re: [Sml-implementers] compilation management

Andrew Kennedy wrote:
>
> (A) Agree on some way to map top-level SML Module identifiers (for
> structures, functors and signatures) to full file names identifying the
> file that contains the single binding for that Module entity (structure,
> functor or signature).

Sorry. I will NEVER agree to that!

We had this discussion before in an OCaml vs. SML context. A naming convention such as the above can be used by an implementation, but it should _not_ be the common ground on which we all live. It is trivial to map implicit file naming such as the one you suggest to explicit naming, but not vice versa.

As I said, I will try to come up with a simple (although perhaps verbose) and very explicit description format that we all can implement. Systems such as CM or your implementation can take whatever scheme they use and _generate_ the explicit format. It is unlikely that we will ever agree on a high-level format (I for one will never agree to a modulename->filename mapping scheme, and others seem to think that such a scheme is the only way they can accept), so abstracting from this issue is the only way to go.

Matthias
|
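Editorial sketch: Matthias's remark that implicit file naming maps trivially to explicit naming, but not the other way round, can be illustrated as follows. The explicit format here is just a list of (module, file) pairs, which is an assumption for illustration and not the interchange format he was designing; the function names are hypothetical.

```python
import posixpath


def explicit_entries(files):
    """Implicit -> explicit: derive (module, file) pairs from file names
    that follow the 'Module.sml' / 'MODULE.sig' convention."""
    entries = []
    for path in sorted(files):
        stem, suffix = posixpath.splitext(posixpath.basename(path))
        if suffix in (".sml", ".sig"):
            entries.append((stem, path))
    return entries


def not_expressible(entries):
    """Explicit -> implicit: return the pairs the convention cannot
    express because the file stem differs from the module name."""
    bad = []
    for module, path in entries:
        stem, _ = posixpath.splitext(posixpath.basename(path))
        if stem != module:
            bad.append((module, path))
    return bad
```

The first direction always succeeds. The second fails as soon as a file holds both a signature and a structure, or is named by a looser convention such as int-redblack-set.sml.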
From: David S. <sw...@cs...> - 2001-10-05 18:25:20
|
Here is our description of TILT's compilation management system. Comments and suggestions are appreciated. David Swasey and Tom Murphy On Wed, 19 Sep 2001 17:27:38 EDT, Robert Harper wrote: > We will send a brief summary of our compilation mechanism to this list > shortly. |
From: John H. R. <jh...@re...> - 2001-09-19 21:53:41
|
In message <C70...@EX...>, Robert Harper writes:
>
> Just to be sure we're on the same page, let me say that I would not want to
> standardize on a TOOL, but rather a LANGUAGE for expressing separate
> compilation. In particular, saying that CM could be made to generate one
> huge source file for those of us that don't implement CM is a non-starter.
>
> An analogy: CM is like type inference; we want to specify the underlying
> explicitly typed syntax.

This is what Matthias is proposing: an explicit syntax for describing how to assemble libraries/applications from sources. CM can be used to generate such a description from a list of sources.

- John
|
From: Andrew K. <ak...@mi...> - 2001-09-19 23:14:49
|
I totally agree. Sorry I wasn't clearer in my message: I was just offering a description of our own compilation model as another *example* not as a proposal for adoption. I like Bob's analogy with type inference - we need a common means of expressing each compiler's particular model.

- Andrew.
|
From: Daniel C. W. <dc...@ag...> - 2001-09-19 23:57:34
|
At 05:27 PM 9/19/2001 -0400, you wrote:
>Just to be sure we're on the same page, let me say that I would not want to
>standardize on a TOOL, but rather a LANGUAGE for expressing separate
>compilation. In particular, saying that CM could be made to generate one
>huge source file for those of us that don't implement CM is a non-starter.
>
>An analogy: CM is like type inference; we want to specify the underlying
>explicitly typed syntax.

This is a good analogy, because it lets me emphasize my point about having one consistent standard build-inference system. Programming without build inference is like programming in ML without type inference. End users of ML would not be happy if every system had a different, incompatible type inference engine. The build language is just a first step. There should be a standard build inference system too (e.g. CM, because it's there and it works...).
|
From: Dave B. <da...@ta...> - 2001-09-20 08:05:02
|
(I see you've beaten me to some of the questions I ask in this message).

Some questions about compilation management.

1. How much do we want "backwards compatibility" with each existing implementation? On the one hand, we want to minimise the work needed to adopt the new system. On the other hand, everyone will have to make some changes, and there might be benefits from requiring more than just the minimum.

2. Shall the new system require a top level? Preferably not, IMO.

3. Should we adopt a common suffix for file names? E.g. ".sml" for structures and functors; ".sig" for signatures. The advantage is uniformity in tools. The disadvantage is more changes for some users.

4. Should we require one module per file and use the same name for the module and the file? The advantage is that it's easy to find modules, easy to explain to new users, and a library file doesn't have to choose between listing modules and files. The disadvantage is more changes for some users.

I'm strongly in favour of this. Apparently some people see this as a gross imposition. I don't even begin to comprehend why they feel this way, but I guess we have to accept their view.

5. Should we make any file into a module? I.e. if a file contains a set of top-level declarations, this would be equivalent to defining a structure with the same name as the file. The advantages are that novices are using the module system as soon as they start using files, and the module system is both natural to use and easy to explain, and we don't have to worry about supporting "core-only" programs. The disadvantage is that users have to add the structure header when adding an explicit signature constraint, or when writing a functor.

OCaml gets a huge win from this simple mapping from files to modules; I'm sure this is one of the design decisions that makes it popular. OCaml doesn't yet support functors or generic signatures as cleanly, but we can do better.

6. Should we support multiple configurations -- e.g. release mode and debug mode, where the two modes apply a certain functor to different arguments?

7. Should we support conditional compilation?

8. Should we standardise the semantics of "use", for those systems that provide a top level? Peter Sestoft analysed the different implementations at the time of SML'97.
|
From: Peter S. <se...@di...> - 2001-09-20 14:20:49
|
> 8. Should we standardise the semantics of "use", for those systems that
> provide a top level? Peter Sestoft analysed the different implementations
> at the time of SML'97.

Maybe this is worth addressing, since the issue was originally raised by a real user of SML (Larry Paulson, I think), who found the differences annoying.

As far as I recall, in 1997 there was pretty good agreement between implementations, after some changes had been made to MLWorks. At a later point SML/NJ drifted away slightly, but I'd have to look at the details again.

Peter
--
Department of Mathematics and Physics * http://www.dina.kvl.dk/~sestoft/
Royal Veterinary and Agricultural University * Tel +45 3528 2334
Thorvaldsensvej 40, DK-1871 Frederiksberg C, Denmark * Fax +45 3528 2350
|
From: John H. R. <jh...@re...> - 2001-09-20 12:55:55
|
In message <4.1...@po...>, Dave Berry writes:
>
> (I see you've beaten me to some of the questions I ask in this message).
>
> Some questions about compilation management.
>
> 1. How much do we want "backwards compatibility" with each existing
> implementation? On the one hand, we want to minimise the work needed to
> adopt the new system. On the other hand, everyone will have to make some
> changes, and there might be benefits from requiring more than just the
> minimum.

The important thing is to make it easy for library/application authors to target the interchange format. The build mechanism needs to be compatible with the SML implementations, but I don't think that we should worry too much about compatibility with existing build mechanisms.

> 2. Shall the new system require a top level? Preferably not, IMO.

I agree.

> 3. Should we adopt a common suffix for file names? E.g. ".sml" for
> structures and functors; ".sig" for signatures. The advantage is
> uniformity in tools. The disadvantage is more changes for some users.

No. The design should not introduce unnecessary barriers to porting existing libraries.

> 4. Should we require one module per file and use the same name for the
> module and the file? The advantage is that it's easy to find modules,
> easy to explain to new users, and a library file doesn't have to choose
> between listing modules and files. The disadvantage is more changes for
> some users.
>
> I'm strongly in favour of this. Apparently some people see this as a gross
> imposition. I don't even begin to comprehend why they feel this way, but I
> guess we have to accept their view.

Because it adds an unnecessary restriction on my programming. There are a number of things that I do regularly, which are incompatible with this requirement. I often want to put a signature and implementation in the same file, have multiple structure definitions in a file, or have multiple implementations of the same module in a directory (see conditional compilation below). Also, the language allows arbitrary-length structure names, whereas some operating systems impose tight restrictions on file-name lengths.

> 5. Should we make any file into a module? I.e. if a file contains a set
> of top-level declarations, this would be equivalent to defining a
> structure with the same name as the file. The advantages are that novices
> are using the module system as soon as they start using files, and the
> module system is both natural to use and easy to explain, and we don't have
> to worry about supporting "core-only" programs. The disadvantage is that
> users have to add the structure header when adding an explicit signature
> constraint, or when writing a functor.
>
> OCaml gets a huge win from this simple mapping from files to modules; I'm
> sure this is one of the design decisions that makes it popular. OCaml
> doesn't yet support functors or generic signatures as cleanly, but we can
> do better.

I don't see a strong need for this change.

> 6. Should we support multiple configurations -- e.g. release mode and
> debug mode, where the two modes apply a certain functor to different arguments?
>
> 7. Should we support conditional compilation?

We have found conditional compilation useful in CM, but mostly as a way to support multiple versions of the compiler in a single sources.cm file. I suspect that once a library is targeting multiple implementations of SML, it will need some mechanism to customize the build process. Such a mechanism could address #6 too.

> 8. Should we standardise the semantics of "use", for those systems that
> provide a top level? Peter Sestoft analysed the different implementations
> at the time of SML'97.
|
From: Dave B. <da...@ta...> - 2001-09-20 20:46:42
|
At 08:54 20/09/2001, John H. Reppy wrote:
>I often want to [...] have multiple
>implementations of the same module in a directory (see conditional compilation
>below).

That's an interesting point.

>Also, the language allows arbitrary-length structure names, whereas
>some operating systems impose tight restrictions on file-name lengths.

Do such OS's still exist? The main one was MS/DOS, of course, but I don't think many people still use that. I don't know anything about PalmOS, Epoch, etc; do they have tight limits?

Dave.
|
From: Matthias B. <bl...@re...> - 2001-09-20 15:23:42
|
Dave Berry wrote:
>
> (I see you've beaten me to some of the questions I ask in this message).
>
> Some questions about compilation management.
>
> 1. How much do we want "backwards compatibility" with each existing
> implementation? On the one hand, we want to minimise the work needed to
> adopt the new system. On the other hand, everyone will have to make some
> changes, and there might be benefits from requiring more than just the
> minimum.

I think that the mechanism that we will hopefully agree on is going to be low-level enough to accommodate everyone's needs.

> 2. Shall the new system require a top level? Preferably not, IMO.

That's a totally different question. I prefer supporting an interactive toplevel in the form of an application library.

> 3. Should we adopt a common suffix for file names? E.g. ".sml" for
> structures and functors; ".sig" for signatures. The advantage is
> uniformity in tools. The disadvantage is more changes for some users.

I don't think that we should _require_ particular suffixes.

> 4. Should we require one module per file and use the same name for the
> module and the file? The advantage is that it's easy to find modules,
> easy to explain to new users, and a library file doesn't have to choose
> between listing modules and files. The disadvantage is more changes for
> some users.

No, absolutely not.

> I'm strongly in favour of this. Apparently some people see this as a gross
> imposition. I don't even begin to comprehend why they feel this way, but I
> guess we have to accept their view.

Here are a few reasons:

-- I (and others) often write multiple "modules" in one file (e.g., signature and functor).
-- File systems are often limited in what names you can choose. The limitations are not uniform from OS to OS (or even just filesystem to filesystem).
-- Not requiring a fixed mapping is more flexible during program development.
-- In a world without fixed mapping, a fixed mapping can always be adopted using convention when this is desired. But the other way around this does not work.
-- The benefits from having a fixed mapping are, IMO, rather marginal.

> 5. Should we make any file into a module? I.e. if a file contains a set
> of top-level declarations, this would be equivalent to defining a
> structure with the same name as the file.

Again, absolutely not! (This is the same thing as above just with one or two fewer lines in the source file.)

> OCaml gets a huge win from this simple mapping from files to modules; I'm
> sure this is one of the design decisions that makes it popular.

Hmm. "Huge"? How do you measure that?

> 6. Should we support multiple configurations -- e.g. release mode and
> debug mode, where the two modes apply a certain functor to different arguments?

This can be taken care of by conditional compilation.

> 7. Should we support conditional compilation?

Probably. (CM does.)

> 8. Should we standardise the semantics of "use", for those systems that
> provide a top level? Peter Sestoft analysed the different implementations
> at the time of SML'97.

If we adopt the point of view that the interactive toplevel is an application library, then this is clearly a problem of that library. Thus, maybe we can defer this decision.

Matthias
|
From: Dave B. <da...@ta...> - 2001-09-20 22:24:16
|
Here's another question, which you may find more relevant. What limitations do each existing system place on the contents of source files? E.g. CM used to have some limits in order to simplify dependency analysis -- I think it banned the use of "open" at top level. I can imagine that other systems might require no redefinition at top-level, or no core-level constructs. If everyone can tell us what limitations their system requires (if any), then we can allow for them in the new design. Dave. |
From: John H. R. <jh...@re...> - 2001-09-20 20:55:56
|
In message <4.1...@po...>, Dave Berry writes:
>
> At 08:54 20/09/2001, John H. Reppy wrote:
> >I often want to [...] have multiple
> >implementations of the same module in a directory (see conditional compilation
> >below).
>
> That's an interesting point.
>
> >Also, the language allows arbitrary-length structure names, whereas
> >some operating systems impose tight restrictions on file-name lengths.
>
> Do such OS's still exist? The main one was MS/DOS, of course, but I don't
> think many people still use that. I don't know anything about PalmOS,
> Epoch, etc; do they have tight limits?

Even though MS/DOS is mostly gone, FAT file systems still exist. Also, I think that the standard for CD Roms restricts file-name length and case (but I may be wrong). Another problem is that HFS+ file systems in MacOS X are case insensitive, which introduces more limits on programs.

Using a fixed mapping between module names and filenames is a limitation that we do not need.

- John
|
From: Daniel C. W. <dc...@ag...> - 2001-09-20 21:30:08
|
At 04:55 PM 9/20/2001 -0400, John H. Reppy wrote:
>In message <4.1...@po...>, Dave Berry writes:
>>
>> At 08:54 20/09/2001, John H. Reppy wrote:
>> >I often want to [...] have multiple
>> >implementations of the same module in a directory (see conditional compilation
>> >below).
>>
>> That's an interesting point.
>>
>> >Also, the language allows arbitrary-length structure names, whereas
>> >some operating systems impose tight restrictions on file-name lengths.
>>
>> Do such OS's still exist? The main one was MS/DOS, of course, but I don't
>> think many people still use that. I don't know anything about PalmOS,
>> Epoch, etc; do they have tight limits?
>
>Even though MS/DOS is mostly gone, FAT file systems still exist. Also,
>I think that the standard for CD Roms restricts file-name length and case
>(but I may be wrong). Another problem is that HFS+ file systems in
>MacOS X are case insensitive, which introduces more limits on programs.
>
>Using a fixed mapping between module names and filenames is a limitation
>that we do not need

The case insensitivity is not just a problem with MacOS X. Under all versions of Windows (95/98/NT/2k), although the file system remembers the case of file names, it treats file names that differ only in case as identical. So Foo.txt and foo.txt are the same file. Try creating the same file with different cases on a Windows machine.

I have had problems with this particular issue when generating code for Java, which maps class names to file names, at least in Sun's JDK. It is a pain to have operating systems with broken semantics intrude on my programming environment.

Doing a little digging confirms John's claim that ISO 9660 CD-ROMs are limited to 8.3 file names and must be all upper case, i.e. DOS semantics.
|
From: Dave B. <da...@ta...> - 2001-09-20 22:49:18
|
At 17:29 20/09/2001, Daniel C. Wang wrote: >The case insensitivity is not just a problem with MacOS X. ... I won't argue this point any more, since it's clear that sufficient people are sufficiently opposed to a fixed mapping. I will just note that library authors would do well to bear case-insensitivity in mind while choosing names for their files. We could even go so far as to require that no files in a given library should be case-insensitive-equal. Yes, it would be a mild restriction on some OS's, but it would help to improve portability. Dave. |
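Editorial sketch: the check Dave suggests here, that no two files in a library be case-insensitive-equal, could be done by a build tool along the following lines. The function name is hypothetical, and comparing with `str.casefold` is an assumption; real file systems each apply their own folding rules.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group file names that differ only by case (e.g. Foo.txt / foo.txt)."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    # Keep only groups with more than one spelling, in a stable order.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

An empty result means the library can be unpacked on a case-insensitive file system without one file overwriting another.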
From: Peter S. <se...@di...> - 2001-09-21 06:10:06
|
On Thu, 20 Sep 2001, Dave Berry wrote:
> At 17:29 20/09/2001, Daniel C. Wang wrote:
> >The case insensitivity is not just a problem with MacOS X. ...
>
> I won't argue this point any more, since it's clear that sufficient people
> are sufficiently opposed to a fixed mapping. I will just note that library
> authors would do well to bear case-insensitivity in mind while choosing
> names for their files. We could even go so far as to require that no files
> in a given library should be case-insensitive-equal.

I agree.

Peter
|
From: Matthias B. <bl...@re...> - 2001-09-21 14:11:52
|
Dave Berry wrote:
>
> At 17:29 20/09/2001, Daniel C. Wang wrote:
> >The case insensitivity is not just a problem with MacOS X. ...
>
> I won't argue this point any more, since it's clear that sufficient people
> are sufficiently opposed to a fixed mapping. I will just note that library
> authors would do well to bear case-insensitivity in mind while choosing
> names for their files.

Yes, this is a good recommendation.

> We could even go so far as to require that no files
> in a given library should be case-insensitive-equal. Yes, it would be a
> mild restriction on some OS's, but it would help to improve portability.

This I wouldn't do -- not as a requirement that we burn into a language spec. The main reason for my saying this is that we should not have to care about such OS- and filesys-specifics at all. Otherwise there will be no end to it:

- no case-insensitive-equal names
- no suffixes longer than k chars (k = 3 ?)
- no filename longer than n chars
- no filename arc longer than m chars
- no dots in filenames (other than to indicate the suffix)
- no whitespace in filenames
- only ASCII in filenames
- no special chars in filenames ("/", "\", "*", "?", ...)
- no ...

It is a slippery slope, and we should not even begin to tread there.

It is _definitely_ a good idea for library authors to keep all of the above in mind. But at the same time, it should be easy to fix problems that may occur on particular systems without having to touch ML source code, i.e., by just renaming files and updating the meta information (= .cm-files in CM jargon).

Matthias
|
From: Dave B. <da...@ta...> - 2001-10-07 17:00:17
|
At 10:10 21/09/2001, Matthias Blume wrote:
>> We could even go so far as to require that no files
>> in a given library should be case-insensitive-equal. ...
>
>This I wouldn't do -- not as a requirement that we burn into a language
>spec. The main reason for my saying this is that we should not have to care
>about such OS- and filesys-specifics at all.

One thing to consider is whether to provide any support for directories in the file paths. If so, then either we need a convention for directory syntax, or people may have to rewrite their project files when they move between OSs. (Although I believe Windows now recognises Unix-style directory syntax -- i.e. forward slash). If we do adopt such a convention, then we are already part way to limiting the names that can appear.

>Otherwise there will be no end
>to it:

That depends; portability isn't all or nothing. As a concrete proposal, suppose that files had to use the following character set: [a-zA-Z0-9_+.-], they had to be case-insensitive, and directories were indicated with forward slashes. This would be portable across all versions of Windows (including CE, and including both FAT and NTFS), all versions of Unix/Linux, and (I think) MacOS. Projects might not port to OS's that limit the length of file names beyond the user's choice of names, or OS's that have other limitations; in these cases users would have to rewrite the file names. So this achieves a level of portability that is pragmatically useful, and is never worse than the alternative.

Possibly the most annoying effect of this limitation would be on people who want to write filenames in a non-English language. But they already have to write all their identifiers in ASCII.

I'm not strongly wedded to this argument, I just think it's worth considering.

Dave.
|
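Editorial sketch: the character-set part of Dave's concrete proposal (arcs drawn from [a-zA-Z0-9_+.-], directories separated by forward slashes) is simple to check mechanically. The function name `is_portable` is hypothetical, and rejecting empty arcs (a leading or doubled slash) is an assumption the message does not address.

```python
import re

# One path arc under the proposed character set.
ARC = re.compile(r"[A-Za-z0-9_+.-]+")


def is_portable(path):
    """True if every '/'-separated arc uses only the proposed characters."""
    return all(ARC.fullmatch(arc) is not None for arc in path.split("/"))
```

Note that this accepts "." and ".." as arcs, since both lie inside the proposed character set.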
From: Doug C. <e...@fl...> - 2001-10-08 16:50:53
|
At 10:14 PM +0100 10/6/01, Dave Berry wrote:
>[...] As a concrete proposal,
>suppose that files had to use the following character set:
>[a-zA-Z0-9_+.-], they had to be case-insensitive, and directories were
>indicated with forward slashes. This would be portable across all versions
>of Windows (including CE, and including both FAT and NTFS), all versions of
>Unix/Linux, and (I think) MacOS. Projects might not port to OS's that
>limit the length of file names beyond the user's choice of names, or OS's
>that have other limitations; in these cases users would have to rewrite the
>file names. So this achieves a level of portability that is pragmatically
>useful, and is never worse than the alternative.

This would work well with MacOS.

Pathname and arc length limitations are an area where guidance to the user should be provided, and perhaps warnings issued by the compilation manager, to foster portability. For example, MacOS (version <= 9) limits path lengths to 255 (or 511) characters, and arcs to 32 characters.

Other file systems may not like '.' used arbitrarily in file names. MacOS doesn't care.

At 2:42 PM +0200 10/8/01, Andreas Rossberg wrote:
>Why not adopt an existing standard and use (a subset of) URI path
>syntax? It is well-known, more or less system independent, and
>expressive enough - and pleasantly close to Unix syntax. Every OS
>already has conventions to map it down to native paths.

It is nice that URI provides a means for specifying both relative and absolute paths, including syntax for climbing a relative path hierarchy. The difficulty, of course, is defining a reasonable subset, since many kinds of non-portable path and file names can be written with URI.

Dave Berry's character set and case-insensitivity, augmented with URI's relative path traversal, is about right. Add arc length limitations, and we'd have a very portable syntax.

e
|
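Editorial sketch: the warnings Doug suggests a compilation manager might issue for over-long paths and arcs could look roughly like this. The defaults of 32 and 255 are simply the figures quoted in his message for MacOS 9 and earlier, not verified limits; the function name and message wording are hypothetical.

```python
def length_warnings(path, max_arc=32, max_path=255):
    """Return human-readable warnings for over-long paths or arcs."""
    warnings = []
    if len(path) > max_path:
        warnings.append(f"path is {len(path)} characters (limit {max_path})")
    for arc in path.split("/"):
        if len(arc) > max_arc:
            warnings.append(
                f"arc '{arc}' is {len(arc)} characters (limit {max_arc})"
            )
    return warnings
```

Returning warnings instead of raising errors matches the thread's leaning towards guidance for library authors over hard requirements.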
From: Andreas R. <ros...@ps...> - 2001-10-08 12:43:40
|
Dave Berry wrote:
>
> One thing to consider is whether to provide any support for directories in
> the file paths. If so, then either we need a convention for directory
> syntax, or people may have to rewrite their project files when they move
> between OSs. (Although I believe Windows now recognises Unix-style
> directory syntax -- i.e. forward slash). If we do adopt such a convention,
> then we are already part way to limiting the names that can appear.

Why not adopt an existing standard and use (a subset of) URI path
syntax? It is well-known, more or less system independent, and
expressive enough - and pleasantly close to Unix syntax. Every OS
already has conventions to map it down to native paths.

- Andreas
|
From: Matthias B. <bl...@re...> - 2001-10-08 18:27:33
|
Dave Berry wrote:
>
> At 10:10 21/09/2001, Matthias Blume wrote:
> >> We could even go so far as to require that no files
> >> in a given library should be case-insensitive-equal. ...
> >
> >This I wouldn't do -- not as a requirement that we burn into a language
> >spec. The main reason for my saying this is that we should not have to care
> >about such OS- and filesys-specifics at all.
>
> One thing to consider is whether to provide any support for directories in
> the file paths. If so, then either we need a convention for directory
> syntax, or people may have to rewrite their project files when they move
> between OSs. (Although I believe Windows now recognises Unix-style
> directory syntax -- i.e. forward slash). If we do adopt such a convention,
> then we are already part way to limiting the names that can appear.

All these suggestions and more are fine, but I still think that we
should not burn them into any "standard" (even though we are no longer
permitted to use the word "standard" :-).

My suggestion would be the following 3-point plan:

1. Define a library specification format where file names are coded as
   generic strings, accommodating any possible OS- and
   implementation-specific name.

   I am actually working on the above. Please, have patience for a few
   more days until I am ready to unveil my creation. :)

2. Specify a portable pathname syntax, perhaps only for relative
   pathnames (but including the possibility of subdirectories and
   perhaps even "parent" arcs etc.) A library description as I am
   currently designing it will abstract over a "context" relative to
   which such pathnames are to be interpreted. My personal preference
   for "portable" syntax is Unix-style.

3. Provide an escape hatch from portable pathname syntax to "native"
   syntax for those who don't care about portability for one reason or
   another.
So the point is that the specification language itself will be neutral to
pathname conventions, but there will be strict guidelines that anyone
should follow who wishes to create portable libraries.

> Possibly the most annoying effect of this limitation would be on people who
> want to write filenames in an non-English language. But they already have
> to write all their identifiers in ASCII.

Two answers to this:

A. The "portable" syntax from above (point 2.) could and perhaps should
   allow for UTF-8 characters. (The software that will be processing
   these names does not care about the Unicode-semantics of individual
   characters with the exception of the arc separator '/' and possibly
   parent- and current arcs (".." and ".").)

B. I wish we could write identifiers using UTF-8 as well. :-)

--
-Matthias
|
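Points 2 and 3 of the plan can be sketched with the Basis OS.Path structure. This is only an illustration under assumptions: the path datatype and the toNative function are hypothetical names, not part of CM or of the format being designed, and the "context" is modelled as a plain native directory string.

```sml
(* Hypothetical representation: a file name in a library description is
   either portable (Unix-style, relative) or native (the escape hatch). *)
datatype path = Portable of string | Native of string

(* Map the portable spellings of current and parent arcs to whatever
   the host OS uses for them. *)
fun arcOf "." = OS.Path.currentArc
  | arcOf ".." = OS.Path.parentArc
  | arcOf a = a

(* Interpret a path relative to a context directory given in native
   syntax. Native paths are passed through untouched. *)
fun toNative context (Native s) = s
  | toNative context (Portable s) =
      let
        (* '/' is the only character with special meaning. *)
        val arcs = map arcOf (String.fields (fn c => c = #"/") s)
        val rel = OS.Path.toString {isAbs = false, vol = "", arcs = arcs}
      in
        OS.Path.concat (context, rel)
      end
```

Because only '/' is inspected, the same function would work unchanged if portable names were allowed to contain UTF-8 bytes.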
From: Andrew K. <ak...@mi...> - 2001-09-21 14:46:31
|
A point related to compilation management is that of "entry points".
Different compilers have different means of indicating where the
entry point to a standalone executable is, and this is a source of
incompatibility when trying to compile SML'97 sources under different
compilers. I can think of at least three:

(a) Use-style, where "main" is essentially defined by side-effecting
    top-level definitions (e.g. val _ = run_my_program()).

(b) C-style naming convention for the main function (e.g. fun main() =
    ...).

(c) Compilation environment directive e.g. SML/NJ's exportFn command.

It would be nice to agree on something here to save us all writing
wrapper code every time.

- Andrew.
|
From: Matthias B. <bl...@re...> - 2001-09-21 15:01:40
|
Andrew Kennedy wrote:
>
> A point related to compilation management is that of "entry points".
> Different compilers have different means of indicating where the
> entry point to a standalone executable is, and this is a source of
> incompatibility when trying to compile SML'97 sources under different
> compilers. I can think of at least three:
>
> (a) Use-style, where "main" is essentially defined by side-effecting
> top-level definitions (e.g. val _ = run_my_program()).
>
> (b) C-style naming convention for the main function (e.g. fun main() =
> ...).
>
> (c) Compilation environment directive e.g. SML/NJ's exportFn command.
>
> It would be nice to agree on something here to save us all writing
> wrapper code every time.

For those who are not aware of this, here is what SML/NJ implements these
days. Maybe we can take this as a blueprint of what we want to do...

We have a shell command "ml-build" which takes three arguments:

    ml-build <library> <functionname> <executable>

The <library> is a .cm-file, and <functionname> is the name of a function
exported from <library>. Its type must be

    string * string list -> OS.Process.status

i.e., the type of "main". Finally, <executable> is the name of the
stand-alone program to be constructed. In our case it currently names a
heap image, but in the future it will probably name a genuine executable.

The command ml-build will invoke the SML/NJ compiler (including CM), build
the library in question if it was not already up-to-date, and then produce
an executable with main entry point <functionname>.

Internally, we use exportFn to establish the entry point, but this is
hidden from the programmer and might even change in the near future.

Maybe we can agree on something similar to the above, i.e., something that
does not make specific claims about implementation details...

Matthias
|
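For illustration, a function of the required type might look like the sketch below. The structure name Hello and the library name hello.cm used in the comment are made up for this example.

```sml
(* A hypothetical entry point of the type ml-build expects:
   string * string list -> OS.Process.status.
   The first component is the program name, the second the arguments. *)
structure Hello =
struct
  fun main (progName : string, args : string list) : OS.Process.status =
    (print (progName ^ " called with "
            ^ Int.toString (length args) ^ " argument(s)\n");
     OS.Process.success)
end

(* Assuming a library description hello.cm that exports structure Hello,
   the build step described above would be:
       ml-build hello.cm Hello.main hello                              *)
```

Nothing in the source itself mentions exportFn or heap images, which is the implementation-neutral property the message argues for.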
From: Andreas R. <ros...@ps...> - 2001-09-21 15:46:03
|
Andrew Kennedy wrote:
>
> Different compilers have different means of indicating where the
> entry point to a standalone executable is, and this is a source of
> incompatibility when trying to compile SML'97 sources under different
> compilers. I can think of at least three:
>
> (a) Use-style, where "main" is essentially defined by side-effecting
> top-level definitions (e.g. val _ = run_my_program()).
>
> (b) C-style naming convention for the main function (e.g. fun main() =
> ...).
>
> (c) Compilation environment directive e.g. SML/NJ's exportFn command.

Choice (a) is preferable IMHO, since it is simplest to use when you
don't care about command arguments or returning error codes. The Std
Basis already allows you to access/return those when you need to, in a
more convenient way.

- Andreas

--
Andreas Rossberg, ros...@ps...

"Computer games don't affect kids; I mean if Pac Man affected us
as kids, we would all be running around in darkened rooms, munching
magic pills, and listening to repetitive electronic music."
- Kristian Wilson, Nintendo Inc.
|
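A rough sketch of style (a) using only the Basis structures CommandLine and OS.Process. The function name run is hypothetical; the final top-level declaration is what makes the program start when the file is loaded.

```sml
(* Hypothetical program body: echo the arguments and report success. *)
fun run (name : string, args : string list) : OS.Process.status =
  (print (name ^ ": " ^ String.concatWith " " args ^ "\n");
   OS.Process.success)

(* Use-style entry point: a side-effecting top-level declaration.
   Arguments come from CommandLine; a failure status is reported to the
   OS with OS.Process.exit, while success just falls off the end. *)
val _ =
  let
    val status = run (CommandLine.name (), CommandLine.arguments ())
  in
    if OS.Process.isSuccess status then () else OS.Process.exit status
  end
```

A program that ignores its arguments and exit status can drop everything but a single val _ = ... line, which is the simplicity argument made above.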
From: Matthias B. <bl...@re...> - 2001-09-21 16:29:39
|
I am still not convinced about the need for an asymmetric "where type".
Consider the following scenario (which, in fact, has happened to us in
similar form, and it was a very annoying experience).

Suppose we have the following signatures A, B, C, and D:

    signature A = sig type a end
    signature B = sig type b end
    signature C = sig
        structure A : A
        structure B : B where type b = A.a * A.a
    end
    signature D = sig
        type d
        val f : d -> unit
    end

Now, suppose further that we want to construct a signature E (perhaps the
formal argument of a functor) consisting of two substructures that match
C and D, respectively:

    signature E = sig
        structure C : C
        structure D : D
    end

Further, we want to be able to apply D.f to values of type C.B.b, so we
must specify that D.d and C.B.b are the same type. All I want to convey
to the compiler is that D.d = C.B.b, but I can't do this -- neither using
"sharing" nor using "where type" because C.B.b is not flexible anymore!
So I have to say something like:

    signature E = sig
        structure C : C
        structure D : D where type d = C.A.a * C.A.a
    end

In other words, the language forces me to trace back what C.B.b was
defined to be -- even though this information is completely irrelevant to
what I am trying to express. Moreover, it is intuitively clear that the
compiler could infer the above automatically from the equation
D.d = C.B.b.

I see no conceptual difficulty with making "where type" (or any type
abbreviation in signatures) symmetric. The meaning should be this:

An attempt is made to "unify" the two types in question, with currently
"flexible" type names playing the roles of type variables.
- "unifying" two flexible type names generates a traditional sharing
  constraint for them, i.e., it throws the two names into an equivalence
  class
- "unifying" a rigid type name, i.e., one that already has a definitional
  spec, applies recursively to the RHS of its spec
- "unifying" a flexible type name with a type constructor application
  generates a definitional spec for the name (and its equivalence class)
- "unifying" two type constructor applications with equal head
  constructor causes the respective arguments to be "unified" recursively
- "unifying" two type constructor applications with unequal head
  constructor causes elaboration to fail with an error message

Why can't we have this? It seems simple and intuitive and expressive.

Matthias
|
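The five rules above can be sketched as a small first-order unifier. This is a toy model under assumptions, not how any existing elaborator works: type names are plain strings, a name with no entry in the environment counts as flexible, an entry name |-> ty stands for either a definitional spec or a sharing link, and the occurs check is an addition that the list above does not mention.

```sml
(* Toy model of signature types; all names here are hypothetical. *)
datatype ty = Name of string            (* type name, flexible or rigid *)
            | Con of string * ty list   (* constructor application      *)

exception Clash of string * string      (* unequal head constructors    *)
exception Cycle of string               (* name would contain itself    *)

fun lookup ([], _) = NONE
  | lookup ((m, t) :: rest, n) = if m = n then SOME t else lookup (rest, n)

(* Follow specs and sharing links until a flexible name or a Con. *)
fun resolve env (Name n) =
      (case lookup (env, n) of SOME t => resolve env t | NONE => Name n)
  | resolve env t = t

fun occurs env a t =
  case resolve env t of
    Name b => a = b
  | Con (_, ts) => List.exists (occurs env a) ts

fun define env (a, t) =
  if occurs env a t then raise Cycle a else (a, t) :: env

(* Returns the environment extended with the specs the equation implies. *)
fun unify env (t1, t2) =
  case (resolve env t1, resolve env t2) of
    (Name a, Name b) => if a = b then env else (a, Name b) :: env
  | (Name a, t as Con _) => define env (a, t)
  | (t as Con _, Name b) => define env (b, t)
  | (Con (c1, ts1), Con (c2, ts2)) =>
      if c1 = c2 andalso length ts1 = length ts2
      then ListPair.foldl (fn (x, y, e) => unify e (x, y)) env (ts1, ts2)
      else raise Clash (c1, c2)

(* The scenario from the message: C.B.b is rigid, D.d is flexible. *)
val env0 = [("C.B.b", Con ("*", [Name "C.A.a", Name "C.A.a"]))]
val env1 = unify env0 (Name "D.d", Name "C.B.b")
```

In this model the equation D.d = C.B.b gives D.d the definitional spec C.A.a * C.A.a without the programmer writing it out, which is the inference the message asks for.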