From: Adam T. <aa-...@us...> - 2024-08-07 13:12:17
|
> could you provide the relevant test output, please? See https://github.com/AA-Turner/docutils/actions/runs/10284219635/job/28459783099 for an online view (I advise collapsing the "alltests" tab and scrolling up to the pytest one, which has better formatted output). The full failure is very verbose, but it is because `if imagepath.startswith('/')` fails. Using ``unquote()`` does make the current tests pass on Windows, my question was more about what is the range of URIs that we deem valid? If we use ``unquote()``, then e.g. ``file:///C|users/john/mallard.png`` will be improperly translated: ```pycon >>> from urllib.parse import urlparse, unquote >>> from urllib.request import url2pathname >>> print(url2pathname(urlparse('file:///C|users/john/mallard.png').path)) C:\users\john\mallard.png >>> print(unquote(urlparse('file:///C|users/john/mallard.png').path)) /C|users/john/mallard.png ``` Now it seems that (at least [from wikipedia](https://en.wikipedia.org/wiki/File_URI_scheme)) this scheme of using `|` is not standardised/used, and might just be a Python quirk -- in which case I think just using ``unquote()`` is easier. Of note, ``pathlib`` does not use the ``|`` replacement mechanism. > Per definitionem, there is no way to provide a system file-path as image ressource, both rST and the Docutils Doctree explicitely expect an URI. This seems to be a problem of imprecise language. ``picture.png`` as used in the example is not a URI, as it has no scheme. Nor is ``/data/blue square.png`` or etc. Indeed, none of the ``.. image::`` directives in ``docs/`` have a scheme, and hence aren't URIs. The actual check that we do is in ``Image.run()``, ``reference = directives.uri(self.arguments[0])``, which just removes unescaped whitespace. > Interpreting an image URI with undefined scheme as system path This goes directly to my point, as the scheme is what makes a URI! (well, the scheme and the path). I think that there is a strong argument based on historical practice in favour of documenting that we accept either a URI-with-scheme (`file:` or `https:` etc) or a local file path. However, I might be missing something, and I agree we should avoid backwards-incompatible changes. My main motivation is simply to fix the tests, and it seems that using ``unquote()`` will work, unless you forsee any other problems. A --- **[bugs:#493] Test failure on Windows with embedded images** **Status:** open **Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner **Last Updated:** Wed Aug 07, 2024 12:22 PM UTC **Owner:** nobody xref [r9785], [r9853], [r9855] Dear @milde, Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows. I believe the following demonstrates the problem: ```pycon >>> import sys; print(sys.platform) win32 >>> import urllib.parse, urllib.request >>> urllib.request.url2pathname('test/data/circle-broken.svg') 'test\\data\\circle-broken.svg' >>> urllib.parse.unquote('test/data/circle-broken.svg') 'test/data/circle-broken.svg' ``` Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``). I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here? For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path). A --- Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |