From: Pascal J. B. <pj...@in...> - 2016-05-15 21:06:30
Daniel Jour <dan...@gm...> writes:

> While updating the asdf module to 3.1.7, I found that our provide and
> require functions are actually case sensitive. Given the ignorance of case
> prevalent in Common Lisp,

Oh no! Common Lisp is VERY MUCH Case Sensitive!

    cl-user> (eq '|foo| '|Foo|)
    nil
    cl-user> (make-package "Foo")
    #<Package "Foo">
    cl-user> (make-package "foo")
    #<Package "foo">
    cl-user> (eq (find-package "foo") (find-package "Foo"))
    nil
    cl-user> (eq #P"/tmp/Foo" #P"/tmp/foo")
    nil
    cl-user> (defparameter *h* (make-hash-table :test 'equal))
    *h*
    cl-user> (setf (gethash "foo" *h*) 42)
    42
    cl-user> (gethash "Foo" *h*)
    nil
    nil
    cl-user> (gethash "foo" *h*)
    42
    t
    cl-user>

There is no part of CL that is not very case sensitive.

> I was surprised that the HyperSpec doesn't
> contain any notes about case sensitivity for provide and require, thus I'd
> assume that we're conforming to ANSI here, right?

Given that ultimately the module name should be mapped to a file name,
it would be dumb to specify a specific case sensitivity for module
names. On the other hand, given that the most common file systems are
case sensitive, it would also be dumb to expect case insensitivity here.

> All of this originated from the surprise when running CLISP with the new
> asdf, because asdf now provides both "ASDF" and "asdf" (as well as "UIOP"
> and "uiop"). Excerpt from asdf.lisp:
>
> ;; Provide both lowercase and uppercase, to satisfy more people,
> especially LispWorks users.
> (provide "asdf") (provide "ASDF")
>
> Consequently, *modules* now also contains both the lowercase and the
> uppercase names.
>
> So far so good, where it really gets surprising is in the registered module
> provider function asdf/operate::module-provide-asdf, because that function
> (part of asdf.lisp) just downcases the module name right away.

As mentioned above, there's a preference for unix file systems, which
are case sensitive, and where the preferred case for file names is
lowercase. Hence the string-downcase.
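To make the mismatch concrete, here is a minimal sketch of the two behaviours side by side. It does not touch the real *modules* or ASDF: *demo-modules*, demo-provided-p and demo-provider-key are hypothetical names made up for illustration. The only assumption taken from the standard is that PROVIDE and REQUIRE compare module names with string= (which, as far as I can tell, is what the HyperSpec entry for them says); the downcasing merely imitates what is described above for the asdf provider.

```lisp
;; Hypothetical stand-in for *modules* after asdf.lisp has been loaded.
(defparameter *demo-modules* (list "asdf" "ASDF"))

(defun demo-provided-p (module-name)
  "Case-sensitive membership test, using STRING= as PROVIDE and REQUIRE do."
  (and (member (string module-name) *demo-modules* :test #'string=) t))

(defun demo-provider-key (module-name)
  "The name a downcasing provider would look for, whatever case was requested."
  (string-downcase (string module-name)))

;; The bookkeeping side distinguishes names that differ only in case ...
(list (demo-provided-p "asdf") (demo-provided-p "Asdf"))
;; => (T NIL)

;; ... while a downcasing provider maps all spellings to one system name.
(mapcar #'demo-provider-key '("foo" "foO" "FOO"))
;; => ("foo" "foo" "foo")
```

So a request for "FOO" can succeed through the provider while *modules* only ever records whatever name the loaded file itself passes to provide.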
> Thus, (require "foo"), (require "foO") and (require "FOO") will each try to
> load (via asdf) the module "foo". *modules* will only contain what the module
> actually provides, though.
>
> This feels very inconsistent to me.

Indeed.

> Some approaches:
>
> * Leave as it is.

Not good, IMO.

> * Make provide and require case insensitive.

Good, IMO.

> * Let asdf only provide either of "ASDF" or "asdf". Though this is more of a
>   cosmetic fix.
> * Ask asdf about it.
>
> What are your thoughts on this?

There's also basically the problem of the default mapping by an
implementation of logical pathname components to physical pathname
components. If the situation were clear there, then it would be quite
simple to specify a direct mapping from the module name, as a component
of a pathname, to a module, thereby adopting the case sensitivity of
the underlying file system.

Usual logical pathname components are specified to be one of:

    word---one or more uppercase letters, digits, and hyphens.

    wildcard-word---one or more asterisks, uppercase letters, digits,
    and hyphens, including at least one asterisk, with no two

So, upper case. However, it is allowed to use lower case letters in a
logical pathname, it being understood that string-upcase is applied.

Now, mapping a logical pathname to a physical pathname is specified to
use an implementation-dependent mapping, but using the "native" file
system conventions. Therefore it would seem logical that the default
translation of logical pathnames to physical pathnames on a unix (POSIX)
platform would involve a string-downcase.

Unfortunately, not all implementations agree on this, hence the need in
libraries and user code to either:

- use explicit logical pathname translations (this is inconvenient,
  since it would mean entering a translation entry for each pathname
  into the logical host translations).
- allow the implementation to use uppercase physical pathnames, while
  expecting a different mapping with different implementations, and
  therefore incompatibility between different implementations on unix
  (POSIX) platforms.

- not use logical pathnames and the implementation-dependent
  translations. Basically, this choice is to use only physical
  pathnames, and therefore let the user code perform the mapping from
  whatever name, using whatever convention, to the file system
  conventions.

or, my favorite:

- specify that the default mapping of logical pathnames to unix (or
  POSIX) physical pathnames should use string-downcase, and patch the
  bad implementations.

Furthermore, notice what is said in section 19.2.2.1.2.2 Common Case in
Pathname Components. On a unix file system, we should have:

    (make-pathname :host (pathname-host #P"/") :directory '(:absolute) :name "foo" :case :common)
    --> #P"/FOO"

    (make-pathname :host (pathname-host #P"/") :directory '(:absolute) :name "FOO" :case :common)
    --> #P"/foo"

    (make-pathname :host (pathname-host #P"/") :directory '(:absolute) :name "Foo" :case :common)
    --> #P"/Foo"

    (make-pathname :host "LOGICAL-HOST" :directory '(:absolute) :name "foo" :case :common)
    --> #P"LOGICAL-HOST:FOO"

    (make-pathname :host "LOGICAL-HOST" :name "FOO" :case :common)
    --> #P"LOGICAL-HOST:FOO"

    (make-pathname :host "LOGICAL-HOST" :name "Foo" :case :common)
    --> #P"LOGICAL-HOST:FOO"

    (setf (logical-pathname-translations "LOGICAL-HOST")
          (list (list "LOGICAL-HOST:*.*" "/tmp/*.*")
                (list "LOGICAL-HOST:*"   "/tmp/*")))

    (translate-logical-pathname #P"LOGICAL-HOST:FOO")
    --> #P"/tmp/foo" ; and definitely not #P"/tmp/FOO"

since:

    19.3.1.1.7 Lowercase Letters in a Logical Pathname Namestring

    When parsing words and wildcard-words, lowercase letters are
    translated to uppercase.

So reading:

    #P"LOGICAL-HOST:foo"

should return:

    --> #P"LOGICAL-HOST:FOO"

and not #P"LOGICAL-HOST:foo" as it does in ccl (and a few other
implementations).
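A quick way to see where a given implementation stands is to parse a lowercase logical namestring and then translate it. The following is only a sketch: it reuses the LOGICAL-HOST translations from the example above, and demo-logical-case-report is a hypothetical helper name. No expected output is shown, because the result is precisely what differs from one implementation to another.

```lisp
;; Same translations as in the example above; the host must be defined
;; before a namestring mentioning it can be parsed as a logical pathname.
(setf (logical-pathname-translations "LOGICAL-HOST")
      (list (list "LOGICAL-HOST:*.*" "/tmp/*.*")
            (list "LOGICAL-HOST:*"   "/tmp/*")))

(defun demo-logical-case-report (namestring)
  "Hypothetical helper: parse NAMESTRING, translate it, and report both results."
  (let* ((logical  (parse-namestring namestring))
         (physical (translate-logical-pathname logical)))
    (list :parsed-name (pathname-name logical)     ; "FOO" if 19.3.1.1.7 is followed
          :physical    (namestring physical))))    ; "/tmp/foo" with a downcasing default

(demo-logical-case-report "LOGICAL-HOST:foo")
```

An implementation that follows the reading given above would report "FOO" for the parsed name and "/tmp/foo" for the physical namestring; anything else is one of the deviations being discussed.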
People often complain about logical pathnames, but the only problem
with them is that they're almost never implemented as specified…

One last note however: on unix you can mount foreign file systems, such
as MS-DOS FAT, MS-Windows, Mac HFS+, etc. (you can even mount file
systems in file systems, so a given path can contain components that
designate elements in multiple different file systems with different
conventions and "customary case")! This means that converting:

    #P"ROOT:ON-EXT2;HFS-MP;ON-HFS;DOS-MP;ON-DOS;EXT4-MP;ON-EXT4;FILE.TXT"

using the default mapping implied by the customary case of each
component would give a physical pathname such as:

    #P"/on-ext2/hfs-mp/On-Hfs/Dos-Mp/ON-DOS/EXT4-MP/on-ext4/file.txt"

Also, file systems can be mounted and unmounted dynamically, so another
translation at another time could give a different rule. And you also
have to translate pathnames to nonexistent files, and therefore use the
customary case of the last file system seen for the nonexistent parts.
(There's no such #P"/tmp/FOO" file on my file system, but #P"/tmp/"
exists and has the lower customary case, so #P"LOGICAL-HOST:FOO" could
still be translated to #P"/tmp/foo").

And finally, it should be noted that most implementations provide hooks
to let the user specify the mapping between module names and modules,
and therefore even if you provide a sane mapping and implementation
here, nothing would prevent the user from implementing a case
insensitive mapping or something stranger. But that would be under the
user's control, so it's up to them.

https://groups.google.com/forum/#!search/comp.lang.lisp$20pjb$20logical$20$20pathnames$20implementations/comp.lang.lisp/kA5KT4WNF8k/1CSM8Hla6xAJ

-- 
__Pascal Bourguignon__ http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk