#524 parse-namestring doesn't like #\CYRILLIC_CAPITAL_LETTER_IE

Milestone: lisp error
Status: closed-fixed
Owner: Sam Steingold
Labels: clisp (525)
Priority: 5
Updated: 2009-06-17
Created: 2009-06-15
Creator: Timofei Shatrov
Private: No

If an argument to parse-namestring contains #\CYRILLIC_CAPITAL_LETTER_IE, it results in an error.

> (parse-namestring "Е")

PARSE-NAMESTRING: syntax error in filename "Е" at position 0.

I'm not sure whether only this particular letter is affected or whether there are others. CLISP mostly processes files with Cyrillic letters just fine, but apparently not when their names contain this letter.

Discussion

  • Sam Steingold
    2009-06-15

    I do not observe this with the cvs head on linux.
    what are you using?
    did you try other characters?
    if this is on win32, the cause might be that the single letter is interpreted as a drive name and that could be a problem.

     
  • Sam Steingold
    2009-06-15

    • assigned_to: haible --> sds
     
  • Yes, this is on Win32, using CLISP 2.47. It errors on every string containing this letter; it doesn't have to be the first letter. It also errors when a full path (with drive and all) is used. I haven't encountered any other letters that cause it to fail yet.

     
  • Sam Steingold
    2009-06-16

    This bug report is now marked as "pending"/"invalid".
    This means that we think that the problem you report
    is not a problem with CLISP.
    Unless you - the reporter - act within 2 weeks,
    the bug will be permanently closed.
    Sorry about the inconvenience -
    we hope your silence means that
    you agree that this is not a bug in CLISP.

     
  • Sam Steingold
    2009-06-16

    • status: open --> pending-invalid
     
  • Sam Steingold
    2009-06-16

    OK, I can reproduce this on win32.
    this boils down to #\CYRILLIC_CAPITAL_LETTER_IE being bytes 208 and 149 (in UTF-8)
    and byte 149 not being a valid byte for a win32 pathname (208 is legal).
    take a look at your config.h:

    /* expression in ch which is true if ch is a valid character in filenames */
    #define VALID_FILENAME_CHAR (ch >= 1) && (ch <= 127) && (ch != 34) && (ch != 42) && (ch != 47) && (ch != 58) && (ch != 60) && (ch != 62) && (ch != 63) && (ch != 92) || (ch == 131) || (ch >= 160) && (ch != 197) && (ch != 206)

    so, the remedy is to use *pathname-encoding* which will correspond to the encoding used by win32.
    see http://clisp.podval.org/impnotes/faq.html#faq-enc-err
    http://clisp.podval.org/impnotes/encoding.html#path-enc
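
The byte values and the predicate quoted above can be checked mechanically. The following is a minimal sketch in C, not CLISP's own code: `utf8_encode_2byte` and `valid_filename_char` are hypothetical helper names, and the predicate only wraps the VALID_FILENAME_CHAR expression quoted from config.h in a function.

```c
#include <stddef.h>

/* Hypothetical helper: encode a code point in U+0080..U+07FF as two
   UTF-8 bytes. Returns 2 on success, 0 if the code point is out of range. */
size_t utf8_encode_2byte(unsigned int cp, unsigned char out[2])
{
    if (cp < 0x80 || cp > 0x7FF)
        return 0;
    out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* lead byte: 110xxxxx */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* continuation: 10xxxxxx */
    return 2;
}

/* The VALID_FILENAME_CHAR expression quoted above, wrapped in a function.
   The parentheses only make C's precedence explicit (&& binds tighter
   than ||); the logic is unchanged. */
int valid_filename_char(int ch)
{
    return ((ch >= 1) && (ch <= 127) && (ch != 34) && (ch != 42)
            && (ch != 47) && (ch != 58) && (ch != 60) && (ch != 62)
            && (ch != 63) && (ch != 92))
        || (ch == 131)
        || ((ch >= 160) && (ch != 197) && (ch != 206));
}
```

For U+0415 the encoder yields bytes 208 and 149; the predicate accepts 208 and rejects 149, which is consistent with the syntax error in the report.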

     
    • status: pending-invalid --> open-invalid
     
  • But my *pathname-encoding* is not utf-8.

    CL-USER> custom:*pathname-encoding*
    #<ENCODING CHARSET:CP1251 :DOS>

    I think "Е" has character code 197 in that encoding (at least Wikipedia tells me so). And I haven't seen any other programs complaining about that file's name, so it must be valid.
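
The arithmetic behind that character code is short. A sketch, assuming only the standard Windows-1251 layout in which the basic Russian letters U+0410..U+044F occupy bytes 0xC0..0xFF in order; `cp1251_from_russian_letter` is a hypothetical name, not a CLISP or Windows function:

```c
/* Map a basic Russian letter (U+0410 .. U+044F) to its Windows-1251 byte.
   Returns -1 for any other code point; the rest of the CP1251 table is
   deliberately not covered by this sketch. */
int cp1251_from_russian_letter(unsigned int cp)
{
    if (cp < 0x0410 || cp > 0x044F)
        return -1;
    return (int)(cp - 0x0410 + 0xC0);
}
```

For U+0415 (CYRILLIC CAPITAL LETTER IE) this returns 197 (0xC5), in agreement with the value given above.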

     
  • Sam Steingold
    2009-06-16

    • status: open-invalid --> pending-invalid
     
  • Sam Steingold
    2009-06-16

    byte 197 is not legal in file names either.
    look at VALID_FILENAME_CHAR in my comment [2009-06-16 11:47]: it is specifically excluded.
    see clisp/src/m4/filecharset.m4.
    are you saying that you have an actual file on disk whose name contains this character?

     
  • Sam Steingold
    2009-06-16

    interesting: I did manage to create files named with both 197 and 149.
    I wonder what the correct test on win32 is.

     
  • Sam Steingold
    2009-06-16

    • status: pending-invalid --> pending
     
  • 1. no side effects
    2. no problem removing the test directory
    3. conftest.out:
    ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42) && (ch != 47) && (ch != 58) && (ch != 60)) || ((ch >= 64) && (ch != 92) && (ch != 124) && (ch != 162) && (ch != 179) && (ch != 190))
    4. uname -a
    CYGWIN_NT-5.2 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin
    windows XP SP3
    5. same results when compiling with and without -mno-cygwin

     
  • Windows XP SP2, compiled with mingw

    1. no side effects
    2. no problems deleting the directory
    3. on NTFS volume:
    ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42) && (ch != 47) && (ch != 58) && (ch != 60)) || ((ch >= 64) && (ch != 92) && (ch != 124) && (ch != 162) && (ch != 179) && (ch != 190))

    (same as for avodonosov)

    on FAT32 volume:
    ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42) && (ch != 47) && (ch != 58) && (ch != 60)) || ((ch >= 64) && (ch <= 96) && (ch != 92)) || (ch == 123) || ((ch >= 125) && (ch != 146) && (ch != 151) && (ch != 162) && (ch != 179) && (ch != 190))
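
The two expressions reported for NTFS and FAT32 are easier to compare by enumerating all byte values. A minimal sketch: the function names are hypothetical, and the bodies are copied from the two conftest results quoted in this comment.

```c
/* Expression reported above for the NTFS volume. */
int valid_on_ntfs(int ch)
{
    return ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42)
            && (ch != 47) && (ch != 58) && (ch != 60))
        || ((ch >= 64) && (ch != 92) && (ch != 124) && (ch != 162)
            && (ch != 179) && (ch != 190));
}

/* Expression reported above for the FAT32 volume. */
int valid_on_fat32(int ch)
{
    return ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42)
            && (ch != 47) && (ch != 58) && (ch != 60))
        || ((ch >= 64) && (ch <= 96) && (ch != 92))
        || (ch == 123)
        || ((ch >= 125) && (ch != 146) && (ch != 151) && (ch != 162)
            && (ch != 179) && (ch != 190));
}

/* Collect the bytes 0..255 accepted by the NTFS expression but rejected
   by the FAT32 one, in ascending order. Returns how many were stored. */
int ntfs_only_bytes(int out[256])
{
    int ch, n = 0;
    for (ch = 0; ch < 256; ch++)
        if (valid_on_ntfs(ch) && !valid_on_fat32(ch))
            out[n++] = ch;
    return n;
}
```

Under these two expressions the difference is the lowercase ASCII letters 97..122 plus bytes 146 and 151, and no byte is accepted on FAT32 only. Byte 197 is accepted by both.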

     
    • status: pending --> open
     
  • Sam Steingold
    2009-06-17

    my own results:

    1. no side effects
    2. no problem removing the test directory
    3. conftest.out:
    ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42) && (ch != 47) && (ch != 58) && (ch != 60)) || ((ch >= 64) && (ch != 92) && (ch != 124))
    4. uname -a
    CYGWIN_NT-5.2 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin
    windows server 2003 r2; NTFS
    5. same results when compiling with and without -mno-cygwin

    it appears that the results depend both on the windows version and on the filesystem type (NTFS vs FAT32)
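
Since the accepted set varies with the OS version and the file system, it has to be probed on the target volume rather than hard-coded. The sketch below illustrates one such probe for a single byte, assuming a writable directory is passed in. `probe_filename_byte` is a hypothetical name and this is not the code in clisp/src/m4/filecharset.m4; it only mirrors the "a<ch>z" file names that appear in the test directory.

```c
#include <stdio.h>

/* Try to create and then remove a file named "a<ch>z" inside `dir`.
   Returns 1 if both steps succeed, 0 otherwise. */
int probe_filename_byte(const char *dir, unsigned char ch)
{
    char path[1024];
    FILE *f;
    int len;

    if (ch == 0)
        return 0; /* NUL would terminate the C string */
    len = snprintf(path, sizeof path, "%s/a%cz", dir, ch);
    if (len < 0 || (size_t)len >= sizeof path)
        return 0; /* path did not fit in the buffer */
    f = fopen(path, "w");
    if (f == NULL)
        return 0; /* the OS or file system refused the name */
    fclose(f);
    return remove(path) == 0;
}
```

A fuller probe would probably also list the directory afterwards to confirm that the name was stored unchanged; this sketch checks only creation and removal.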

     
  • Sam Steingold
    2009-06-17

    I enabled the test on all platforms.

     
  • Sam Steingold
    2009-06-17

    • status: open --> closed-fixed
     
  • Sam Steingold
    2009-06-17

    thank you for your bug report.
    the bug has been fixed in the CVS tree.
    you can either wait for the next release (recommended)
    or check out the current CVS tree (see http://clisp.cons.org)
    and build CLISP from the sources (be advised that between
    releases the CVS tree is very unstable and may not even build
    on your platform).

     
  • Sam Steingold
    2009-07-02

    <http://article.gmane.org/gmane.lisp.clisp.devel/20352>

    with "gcc foo.c":
    1. no side effects
    2. no problem removing the directory
    3. conftest-cygwin.out:
    ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42) && (ch != 47) &&
    (ch != 58) && (ch != 60)) || ((ch >= 64) && (ch != 92) && (ch != 124))
    4.
    $ uname -a
    CYGWIN_NT-5.1 blackthorn 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin
    $ gcc --version
    gcc (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)

    with "gcc foo.c -mno-cygwin"
    1. one file left over in the test directory:
    $ ls -l conftestdir/
    total 0
    -rwxr-xr-x 1 Elliott Slaughter None 0 Jul 1 23:33 a?z
    Note: Windows explorer lists the name as "aÿz" (byte 0xFF).
    This result seems to be reproducible.
    2. no problem removing the directory
    3. conftest-nocygwin.out:
    ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42) && (ch != 47) &&
    (ch != 58) && (ch != 60)) || ((ch >= 64) && (ch != 92) && (ch != 124))
    4. (same system as above)

     
  • Sam Steingold
    2009-07-02

    <http://article.gmane.org/gmane.lisp.clisp.devel/20353>

    cygwin-1.5:
    ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42) && (ch != 47) &&
    (ch != 58) && (ch != 60)) || ((ch >= 64) && (ch != 92) && (ch != 124))

    cygwin-1.7:
    ((ch >= 32) && (ch <= 191) && (ch != 47) && (ch != 92)) || (ch >= 248)

    detect cygwin-1.7:
    #include <cygwin/version.h>
    #if (CYGWIN_VERSION_API_MINOR >= 181)
    #assert cygwin 1.7
    #endif

    --
    Reini Urban
    http://phpwiki.org/ http://murbreak.at/
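
The two cygwin expressions above can be put side by side in code. A minimal sketch with hypothetical function names: `api_minor` stands in for CYGWIN_VERSION_API_MINOR and is passed as an ordinary argument so that the example compiles outside cygwin, and the threshold 181 is taken from the detection snippet above.

```c
/* Expression reported above for cygwin-1.5. */
int valid_on_cygwin15(int ch)
{
    return ((ch >= 32) && (ch <= 61) && (ch != 34) && (ch != 42)
            && (ch != 47) && (ch != 58) && (ch != 60))
        || ((ch >= 64) && (ch != 92) && (ch != 124));
}

/* Expression reported above for cygwin-1.7. */
int valid_on_cygwin17(int ch)
{
    return ((ch >= 32) && (ch <= 191) && (ch != 47) && (ch != 92))
        || (ch >= 248);
}

/* Pick the expression by API minor version: 181 or later is treated
   as cygwin 1.7, anything earlier as cygwin 1.5. */
int valid_on_cygwin(int ch, int api_minor)
{
    return api_minor >= 181 ? valid_on_cygwin17(ch) : valid_on_cygwin15(ch);
}
```

Under these expressions byte 197 is accepted on cygwin-1.5 but rejected on cygwin-1.7, where the range 192..247 is excluded, while bytes such as 62 and 124 go the other way.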