From: Adam T. <aa-...@us...> - 2024-08-10 14:09:33
|
> the second is needlessly restrictive and hard to explain. Suppose a user has a directive ``.. image:: /media/circle.svg``. By default this will load the file under `/media/circle.svg`, as the user might expect if they think they are using filesystem paths, and don't realise the nuances of URI references. At some later date, this user sets `root_prefix` to `/home/user/project/`. The image directive will stop working, as it now looks for a file under `/home/user/project/media/circle.svg`. I think this behaviour is hard to explain and quite surprising. I have a proposed fix at https://github.com/AA-Turner/docutils/pull/18 (https://github.com/AA-Turner/docutils/pull/18.patch). This does the following: * Adds a helper function to convert image URIs to paths * Updates all writers (HTML, LaTeX, and ODF/ODT) that support images to use this helper for consistent behaviour * Correctly convert `file:///` URIs to filesystem paths (using Python 3.13's [Path.from_uri()](https://docs.python.org/3.13/library/pathlib.html#pathlib.Path.from_uri) where possible). * Validate that `file:` URIs are absolute paths * Correctly convert relative references to filesystem paths * Changes the default `root_prefix` to `''` (the empty string), and only applies the `root_prefix` behaviour when it is non-empty (i.e. a user has configured the setting) The second commit in the patch goes further, by rejecting absolute paths when used in relative references. This is optional, but as I argue above I think allowing absolute paths in relative references leads to surprising behaviour, and the resolution is easy if you want to keep using absolute paths (just use `file:///`). I think this would be a straightforward end-state to explain to users: either the argument to `.. image::` (and `.. figure::`) is a URI or it is a relative filesystem path. A --- **[bugs:#493] Test failure on Windows with embedded images** **Status:** open **Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner **Last Updated:** Sat Aug 10, 2024 08:29 AM UTC **Owner:** nobody xref [r9785], [r9853], [r9855] Dear @milde, Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows. I believe the following demonstrates the problem: ```pycon >>> import sys; print(sys.platform) win32 >>> import urllib.parse, urllib.request >>> urllib.request.url2pathname('test/data/circle-broken.svg') 'test\\data\\circle-broken.svg' >>> urllib.parse.unquote('test/data/circle-broken.svg') 'test/data/circle-broken.svg' ``` Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``). I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here? For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path). A --- Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |