
#315 Filenames with space and non-ASCII characters cause trouble

Status: closed-works-for-me
Owner: nobody
Labels: None
Priority: 5
Updated: 2020-03-03
Created: 2017-03-31
Creator: Getreu
Private: No

Filenames with spaces and non-ASCII characters cause trouble when used in the ".. include" or ".. image" directive.

Discussion

  • Getreu

    Getreu - 2017-03-31

    This bug feels like a flashback to the '90s ;)

     
    • David Goodger

      David Goodger - 2017-03-31

      BTW, snide comments are never appreciated. They only increase the likelihood of your bug report being ignored.

       
  • David Goodger

    David Goodger - 2017-03-31
    • status: open --> pending-remind
     
  • David Goodger

    David Goodger - 2017-03-31

    Please provide some evidence, such as a minimal example.

    http://docutils.sourceforge.net/BUGS.html#how-to-report-a-bug

    Exactly what trouble is being caused, and how? Under what Docutils version, Python version, OS?

     
  • Getreu

    Getreu - 2017-03-31

    Consider the following file (both referenced files exist):

    ****
    tmp8
    ****
    
    .. image:: größeres Bild.jpg
    
    .. include:: schöne Einleitung.rst
    

    HTML rendition with error messages:

    SystemMessage
    
    /home/getreu/Desktop/tmp8/20170331-tmp8--Notes.rst:7: (SEVERE/4) Problems with "include" directive path:
    InputError: [Errno 2] No such file or directory: 'sch\xf6ne Einleitung.rst'.
    
    ****
    tmp8
    ****
    
    .. image:: größeres Bild.jpg
    
    .. include:: schöne Einleitung.rst
    

    My Docutils and Python versions are:

    rst2html -V
    rst2html (Docutils 0.13.1 [release], Python 2.7.13, on linux2)
    
     
    • David Goodger

      David Goodger - 2017-03-31

      Works for me, using the repository version of Docutils on Python 2.7.12 on Ubuntu Linux 16.04:

      rst2html.py x.txt ./x.html

      What is your OS & version?

      What is the encoding of your source file? Please attach all files required to build the example, and specify the command used.

      What is the encoding of the filesystem's filename/directory structure?

      How do spaces enter into this?

       
  • Günter Milde

    Günter Milde - 2017-03-31

    I get a similar error (No such file or directory: 'sch\xf6ne Einleitung.rst'.) when the file does not exist (e.g. because the actual file name is "schöne Einleitung.txt"). However, the \xf6 is only a problem in error reporting - including files with non-ASCII characters and spaces works (if the document encoding and the file system encoding match).
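
The point that the \xf6 only affects error reporting can be illustrated with a small sketch. It assumes Python 3, whose ascii() is used here as a stand-in for how Python 2.7 escaped the name in the message above. It also shows why the document encoding and the file system encoding must match: the same name yields different bytes under different encodings.

```python
import sys

name = "schöne Einleitung.rst"

# ascii() escapes non-ASCII characters; "ö" is U+00F6, hence \xf6.
escaped = ascii(name)
print(escaped)

# The same name as bytes under two common encodings. If the document is
# decoded with one and the file system uses the other, the lookup fails.
utf8_bytes = name.encode("utf-8")
latin1_bytes = name.encode("latin-1")
print(utf8_bytes)
print(latin1_bytes)

# Encoding Python uses for file names on this system.
print(sys.getfilesystemencoding())
```

The escaped form is only a display convention; the string itself still contains the character "ö".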

    The "image" directive expects a URI, not a file name.
    With

    .. image:: größeres%20Bild.jpg
    

    it works as expected.
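
A minimal Python 3 sketch of how such a URI could be produced from a file name, using only the standard library. The first form encodes just the space, as in the example above. The second uses urllib.parse.quote, which also percent-encodes the non-ASCII characters as UTF-8 bytes. Whether a given Docutils writer or browser resolves the fully encoded form to the same file is an assumption not tested in this thread.

```python
from urllib.parse import quote, unquote

filename = "größeres Bild.jpg"

# Form used in this thread: only the space is percent-encoded.
spaces_only = filename.replace(" ", "%20")

# Fully encoded form: quote() also encodes "ö" and "ß" as UTF-8 bytes.
fully_encoded = quote(filename)

print(spaces_only)    # größeres%20Bild.jpg
print(fully_encoded)  # gr%C3%B6%C3%9Feres%20Bild.jpg

# Both forms decode back to the original file name.
assert unquote(spaces_only) == filename
assert unquote(fully_encoded) == filename
```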

     
  • Getreu

    Getreu - 2017-04-01
    .. image:: größeres%20Bild.jpg
    
    .. include:: schöne Einleitung.rst
    

    The above finally works. Thank you guys for your help! I was confused because it does not work the same way in rst2pdf and Sphinx.

    About URIs: a URI can be percent-encoded, but it does not have to be, especially when Unicode is available. I suggest allowing the following syntax here:

    .. image:: größeres Bild.jpg
    
    .. include:: schöne Einleitung.rst
    

    BTW: Please note that größeres%20Bild.jpg is already not fully percent-encoded. RFC 3986 (January 2005), section 2.3, defines the unreserved characters as only these: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 - _ . ~ Other characters in a URI must be percent-encoded, and here you already allow "ö" and "ß" (which I appreciate). So why not allow " " as well?
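
The character check described above can be sketched in a few lines of Python 3. The helper name non_unreserved is hypothetical. It lists only the characters that fall outside the unreserved set of RFC 3986 section 2.3. It does not validate a URI: the RFC also permits reserved delimiters and the "%" that introduces a percent-encoded triplet.

```python
import string

# Unreserved characters per RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def non_unreserved(text: str) -> list:
    """Return the sorted, distinct characters of text outside the unreserved set."""
    return sorted({ch for ch in text if ch not in UNRESERVED})

print(non_unreserved("größeres%20Bild.jpg"))  # ['%', 'ß', 'ö']
print(non_unreserved("größeres Bild.jpg"))    # [' ', 'ß', 'ö']
```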

    Besides, URI or not, I think % encoding has no place in a modern markup language nowadays.

    Regards

     
    • David Goodger

      David Goodger - 2017-04-01

      Use this syntax:

      .. image:: größeres\ Bild.jpg
      

      Bare (unescaped) spaces in URIs will not be supported in reST because they are inherently ambiguous.
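
The ambiguity can be illustrated with a simplified Python 3 sketch. It is not the Docutils implementation, and the function name normalize_uri_argument is hypothetical. It assumes that whitespace inside a URI argument is dropped (so that long URIs can be wrapped across lines) and that only a backslash-escaped space is kept as a real space.

```python
def normalize_uri_argument(argument: str) -> str:
    """Drop unescaped whitespace; keep backslash-escaped spaces as spaces."""
    # Split on backslash + space so that escaped spaces survive as separators.
    parts = argument.split("\\ ")
    # Within each part, remove every remaining (unescaped) whitespace character.
    return " ".join("".join(part.split()) for part in parts)

# A bare space is indistinguishable from line-wrapping whitespace and is lost.
print(normalize_uri_argument("größeres Bild.jpg"))     # größeresBild.jpg
# An escaped space is preserved.
print(normalize_uri_argument("größeres\\ Bild.jpg"))   # größeres Bild.jpg
# A percent-encoded space contains no whitespace and passes through unchanged.
print(normalize_uri_argument("größeres%20Bild.jpg"))   # größeres%20Bild.jpg
```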

      The fact that the "include" directive works with spaces in the path, without backslashes escaping the spaces, may just be an oversight, a happy accident. "include" is defined as requiring a filesystem path, whereas "image" requires a URI. We may revisit how "include" handles spaces in future.

      I'm going to close this bug report.

       
      • Getreu

        Getreu - 2017-04-02

        David,

        your solution does not work for me! .. image:: größeres%20Bild.jpg is the only way I could access the image.

        Bare (unescaped) spaces in URIs will not be supported in reST because they are inherently ambiguous.

        Escaped spaces are still better than having % encoding in a markup language! Unfortunately \ does not work as documented (I tested again to be sure).

        Please test yourself and reopen this bug report.

        Please also consider allowing spaces in file pathnames. The escaped notation can be kept in parallel for ambiguous cases.

         

        Last edit: Getreu 2017-04-02
        • David Goodger

          David Goodger - 2017-04-02

          This feature was added recently, and is not part of a release yet. If you want to use it, either wait for a new release or update from the repository.

          Please also consider allowing spaces in file pathnames. The escaped notation can be kept in parallel for ambiguous cases.

          For consistency, we may eventually implement the following:

          .. include:: path\ with\ spaces.txt
          

          But as it currently works without the backslash escapes (i.e., the reST parser already allows spaces in file paths), it's not a priority.

           
          • Getreu

            Getreu - 2017-04-03

            I found the commit you are referring to: [r8024]. Thank you!

            To improve ergonomics even further, I filed feature request #54.

             

            Related

            Commit: [r8024]


            Last edit: Getreu 2017-04-03
  • David Goodger

    David Goodger - 2017-04-01
    • status: pending-remind --> closed-works-for-me
     
