|
From: Adam T. <aa-...@us...> - 2024-08-07 02:25:44
|
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Wed Aug 07, 2024 02:25 AM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-07 12:22:37
|
Unfortunately, I don't know what the "test failure on Windows" actually is, could you provide the relevant test output, please?
> Is it possible to use the simpler unquote() here?
If it works on Windows, using unquote() is a good idea. Could you test with
~~~
diff --git a/docutils/docutils/writers/_html_base.py b/docutils/docutils/writers/_html_base.py
index ead930e31..2eab36510 100644
--- a/docutils/docutils/writers/_html_base.py
+++ b/docutils/docutils/writers/_html_base.py
@@ -626,7 +626,7 @@ def uri2imagepath(self, uri):
uri_parts = urllib.parse.urlparse(uri)
if uri_parts.scheme not in ('', 'file'):
raise ValueError('Can only read local images.')
- imagepath = urllib.request.url2pathname(uri_parts.path)
+ imagepath = urllib.parse.unquote(uri_parts.path)
if imagepath.startswith('/'): # cf. config.html#root-prefix
root_prefix = Path(self.settings.root_prefix)
imagepath = (root_prefix/imagepath[1:]).as_posix()
~~~
and adapt the test script if required?
> Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
Per definitionem, there is no way to provide a system file-path as image ressource, both [rST](https://docutils.sourceforge.io/docs/ref/rst/directives.html#image) and the [Docutils Doctree](https://docutils.sourceforge.io/docs/ref/doctree.html#image) explicitely expect an URI.
As URIs are capable of representing local file paths, I dont see the advantage.
Interpreting an image URI with undefined scheme as system path would be a complication of both, specification and implementation and a backwards-incompatible change. We would als have to think about images that are only read to get the size but kept external in HTML output.
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Wed Aug 07, 2024 02:25 AM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Adam T. <aa-...@us...> - 2024-08-07 13:12:17
|
> could you provide the relevant test output, please? See https://github.com/AA-Turner/docutils/actions/runs/10284219635/job/28459783099 for an online view (I advise collapsing the "alltests" tab and scrolling up to the pytest one, which has better formatted output). The full failure is very verbose, but it is because `if imagepath.startswith('/')` fails. Using ``unquote()`` does make the current tests pass on Windows, my question was more about what is the range of URIs that we deem valid? If we use ``unquote()``, then e.g. ``file:///C|users/john/mallard.png`` will be improperly translated: ```pycon >>> from urllib.parse import urlparse, unquote >>> from urllib.request import url2pathname >>> print(url2pathname(urlparse('file:///C|users/john/mallard.png').path)) C:\users\john\mallard.png >>> print(unquote(urlparse('file:///C|users/john/mallard.png').path)) /C|users/john/mallard.png ``` Now it seems that (at least [from wikipedia](https://en.wikipedia.org/wiki/File_URI_scheme)) this scheme of using `|` is not standardised/used, and might just be a Python quirk -- in which case I think just using ``unquote()`` is easier. Of note, ``pathlib`` does not use the ``|`` replacement mechanism. > Per definitionem, there is no way to provide a system file-path as image ressource, both rST and the Docutils Doctree explicitely expect an URI. This seems to be a problem of imprecise language. ``picture.png`` as used in the example is not a URI, as it has no scheme. Nor is ``/data/blue square.png`` or etc. Indeed, none of the ``.. image::`` directives in ``docs/`` have a scheme, and hence aren't URIs. The actual check that we do is in ``Image.run()``, ``reference = directives.uri(self.arguments[0])``, which just removes unescaped whitespace. > Interpreting an image URI with undefined scheme as system path This goes directly to my point, as the scheme is what makes a URI! (well, the scheme and the path). I think that there is a strong argument based on historical practice in favour of documenting that we accept either a URI-with-scheme (`file:` or `https:` etc) or a local file path. However, I might be missing something, and I agree we should avoid backwards-incompatible changes. My main motivation is simply to fix the tests, and it seems that using ``unquote()`` will work, unless you forsee any other problems. A --- **[bugs:#493] Test failure on Windows with embedded images** **Status:** open **Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner **Last Updated:** Wed Aug 07, 2024 12:22 PM UTC **Owner:** nobody xref [r9785], [r9853], [r9855] Dear @milde, Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows. I believe the following demonstrates the problem: ```pycon >>> import sys; print(sys.platform) win32 >>> import urllib.parse, urllib.request >>> urllib.request.url2pathname('test/data/circle-broken.svg') 'test\\data\\circle-broken.svg' >>> urllib.parse.unquote('test/data/circle-broken.svg') 'test/data/circle-broken.svg' ``` Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``). I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here? For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path). A --- Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-07 14:54:31
|
I am in favour of using unquote(). > Now it seems that (at least from wikipedia) this scheme of using | is not standardised/used, and might just be a Python quirk The "|" seems to be not Python specific, however, it is obsolete. Wiki says: > On Microsoft Windows systems, the normal colon (:) after a device letter has sometimes been replaced by a vertical bar (|) in file URLs. This reflected the original URL syntax, which made the colon a reserved character in a path part. Since Internet Explorer 4, file URIs have been standardized on Windows, and should follow the following scheme. URI vs. Path > This seems to be a problem of imprecise language. picture.png as used in the example is not a URI, as it has no scheme. > ... > we accept either a URI-with-scheme (file: or https: etc) or a local file path. Actually not: by default, the "uri" attribute is written as-is to the HTML `<img>` element's "src" attribute which expects an URL and may fail for a number of valid system paths. It seems what Docutils expects is a [URI-Reference](https://datatracker.ietf.org/doc/html/rfc3986#section-4.1): > A URI-reference is either a URI or a relative reference. > If the URI-reference's prefix does not match the syntax of a scheme followed by its colon separator, then the URI-reference is a [relative reference](https://datatracker.ietf.org/doc/html/rfc3986#section-4.2). In contrast to a system file path, a relative reference still requires forward slashes and quoting of some characters. --- **[bugs:#493] Test failure on Windows with embedded images** **Status:** open **Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner **Last Updated:** Wed Aug 07, 2024 01:12 PM UTC **Owner:** nobody xref [r9785], [r9853], [r9855] Dear @milde, Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows. I believe the following demonstrates the problem: ```pycon >>> import sys; print(sys.platform) win32 >>> import urllib.parse, urllib.request >>> urllib.request.url2pathname('test/data/circle-broken.svg') 'test\\data\\circle-broken.svg' >>> urllib.parse.unquote('test/data/circle-broken.svg') 'test/data/circle-broken.svg' ``` Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``). I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here? For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path). A --- Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-07 23:37:10
|
- **Comment**:
Fixed in [r9866]. Documentation updated in [r9865].
Thank you for report and analysis.
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Wed Aug 07, 2024 02:54 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-08 16:24:09
|
- **status**: open --> open-fixed
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open-fixed
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Wed Aug 07, 2024 11:37 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Adam T. <aa-...@us...> - 2024-08-08 20:38:19
|
- **status**: open-fixed --> open
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Thu Aug 08, 2024 04:24 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Adam T. <aa-...@us...> - 2024-08-08 20:43:19
|
Sorry, another failure.
The root cause of this one seems to be the `uri_parts.scheme not in ('', 'file')` check. On Windows, we get the following, because drives in absolute paths are parsed as URI schemes:
```pycon
>>> uri
'S:/Development/docutils/docutils/test/data/circle-broken.svg'
>>> urllib.parse.urlparse(uri).scheme
's'
```
We could do something like ``Path(uri).is_file()``, but that has the problems with using pathlib that you mentioned earlier...
Full failure below, also on https://github.com/AA-Turner/docutils/actions/runs/10308701375/job/28536730749
```
============================================================================================================= FAILURES ==============================================================================================================
_________________________________________________________________________ Html5WriterPublishPartsTestCase.test_publish (id="totest['system_messages'][1]") __________________________________________________________________________
self = <test.test_writers.test_html5_polyglot_parts.Html5WriterPublishPartsTestCase testMethod=test_publish>
def test_publish(self):
if not with_pygments:
del totest['syntax_highlight']
writer = 'html5'
for name, (settings_overrides, cases) in totest.items():
for casenum, (case_input, case_expected) in enumerate(cases):
with self.subTest(id=f'totest[{name!r}][{casenum}]'):
parts = docutils.core.publish_parts(
source=case_input,
writer=writer,
settings_overrides={
'_disable_config': True,
'strict_visitor': True,
'stylesheet': '',
'section_self_link': True,
**settings_overrides,
}
)
> self.assertEqual(case_expected, self.format_output(parts))
E AssertionError: {'fragment': '<svg xmlns="http://www.w3.org/2000/svg"\n [410 chars]>\n'} != {'fragment': '<img alt="S:/Development/docutils/docutils[396 chars]>\n'}
E + {'fragment': '<img '
E + 'alt="S:/Development/docutils/docutils/test/data/circle-broken.svg" '
E + 'src="S:/Development/docutils/docutils/test/data/circle-broken.svg" '
E - {'fragment': '<svg xmlns="http://www.w3.org/2000/svg"\n'
E - ' viewBox="0 0 10 10">\n'
E - ' <circle cx="5" cy="5" r="4" fill="lightblue" x/>\n'
E - '</svg>\n'
E - '\n'
E + '/>\n'
E ? ++
E
E '<aside class="system-message">\n'
E '<p class="system-message-title">System Message: ERROR/3 (<span '
E 'class="docutils literal"><string></span>, line 1)</p>\n'
E - '<p>Cannot parse SVG image '
E ? ---- ^^^^
E
E + '<p>Cannot embed image '
E ? ^^^^
E
E '"S:/Development/docutils/docutils/test/data/circle-broken.svg":\n'
E - ' not well-formed (invalid token): line 3, column 48</p>\n'
E + ' Can only read local images.</p>\n'
E '</aside>\n'}
docutils\test\test_writers\test_html5_polyglot_parts.py:92: AssertionError
________________________________________________________________________ Html5WriterPublishPartsTestCase.test_publish (id="totest['no_system_messages'][1]") ________________________________________________________________________
self = <test.test_writers.test_html5_polyglot_parts.Html5WriterPublishPartsTestCase testMethod=test_publish>
def test_publish(self):
if not with_pygments:
del totest['syntax_highlight']
writer = 'html5'
for name, (settings_overrides, cases) in totest.items():
for casenum, (case_input, case_expected) in enumerate(cases):
with self.subTest(id=f'totest[{name!r}][{casenum}]'):
parts = docutils.core.publish_parts(
source=case_input,
writer=writer,
settings_overrides={
'_disable_config': True,
'strict_visitor': True,
'stylesheet': '',
'section_self_link': True,
**settings_overrides,
}
)
> self.assertEqual(case_expected, self.format_output(parts))
E AssertionError: {'fragment': '<svg xmlns="http://www.w3.org/2000/svg"\n [85 chars]n\n'} != {'fragment': '<img alt="S:/Development/docutils/docutils[98 chars]>\n'}
E + {'fragment': '<img '
E + 'alt="S:/Development/docutils/docutils/test/data/circle-broken.svg" '
E + 'src="S:/Development/docutils/docutils/test/data/circle-broken.svg" '
E - {'fragment': '<svg xmlns="http://www.w3.org/2000/svg"\n'
E - ' viewBox="0 0 10 10">\n'
E - ' <circle cx="5" cy="5" r="4" fill="lightblue" x/>\n'
E - '</svg>\n'
E - '\n'}
E + '/>\n'}
E ? ++
docutils\test\test_writers\test_html5_polyglot_parts.py:92: AssertionError
```
A
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Thu Aug 08, 2024 08:38 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-09 08:45:22
|
The parsing is correct: The image argument is a "URI reference":
> A URI-reference is either a URI or a relative reference. If the
URI-reference's prefix does not match the syntax of a scheme followed
by its colon separator, then the URI-reference is a relative
reference. (RFC 3986)
Hence, the example
~~~
.. image:: S:/Development/data/circle-broken.svg
~~~
describes an image accessible under scheme "S".
Our problem is, that we cannot easily use relative paths (to keep flexibility regarding where to run the tests from).
A simple solution would be using a "file" scheme in the offending examples:
~~~
diff --git a/docutils/test/test_writers/test_html5_polyglot_parts.py b/docutils/test/test_writers/test_html5_polyglot_parts.py
index ba139d65c..102273234 100755
--- a/docutils/test/test_writers/test_html5_polyglot_parts.py
+++ b/docutils/test/test_writers/test_html5_polyglot_parts.py
@@ -714,7 +714,7 @@ def format_output(self, parts):
""",
}],
[f"""\
-.. image:: {DATA_ROOT}/circle-broken.svg
+.. image:: file://{DATA_ROOT}/circle-broken.svg
:loading: embed
""",
{'fragment': f"""\
@@ -725,7 +725,7 @@ def format_output(self, parts):
<aside class="system-message">
<p class="system-message-title">System Message: ERROR/3 (<span class="docutils literal"><string></span>, line 1)</p>
-<p>Cannot parse SVG image "{DATA_ROOT}/circle-broken.svg":
+<p>Cannot parse SVG image "file://{DATA_ROOT}/circle-broken.svg":
not well-formed (invalid token): line 3, column 48</p>
</aside>
"""
@@ -831,7 +831,7 @@ def format_output(self, parts):
""",
}],
[f"""\
-.. image:: {DATA_ROOT}/circle-broken.svg
+.. image:: file://{DATA_ROOT}/circle-broken.svg
:loading: embed
""",
{'fragment': """\
~~~
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Thu Aug 08, 2024 08:43 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Adam T. <aa-...@us...> - 2024-08-09 21:55:30
|
> The parsing is correct: The image argument is a "URI reference" ...
The issue here now becomes why the tests were not failing on Linux, I believe. Both Windows and Linux would have had an absolute path (e.g. `.. image:: /home/user/development/docutils/docutils/test/data/circle-broken.svg`). On Windows, this failed due to misinterpreting the drive letter as a scheme. However, because the default `root_prefix` is `/`, when an absolute path is used on Unix, the scheme is `''`, `imagepath.startswith('/')` is `True`, and `root_prefix/imagepath[1:]` just strips and restores the leading slash.
I believe that if we are only accepting URI-refereneces, then we should reject absolute paths on UNIX, and the tests should fail.
`--image-loading` was added in Docutils 0.18 and `:loading:` was added in 0.21. Local testing shows that on Linux, absolute paths are allowed with `--image-loading embed` or `:loading: embed` in every released version and the current master branch. On Windows, 0.18-0.20 allowed absolute paths, but since 0.21 an error has been raised.
The following table demonstrates the matrix. `image_loading` means using `--image-loading embed` and `:loading:` means using `:loading: embed`. My test commands were:
```console
$ printf ".. image:: /mnt/s/Development/docutils/docutils/test/data/circle.svg" | rst2html5 --link-stylesheet --image-loading embed
$ printf ".. image:: /mnt/s/Development/docutils/docutils/test/data/circle.svg\n :loading: embed" | rst2html5 --link-stylesheet
PS> echo ".. image:: S:/Development/docutils/docutils/test/data/circle.svg" | python rst2html5.py --link-stylesheet --image-loading embed
PS> echo ".. image:: S:/Development/docutils/docutils/test/data/circle.svg`n :loading: embed" | python rst2html5.py --link-stylesheet
```
*Absolute path support in the image directive with embed mode*
| | **Linux** | **Linux** | **Windows** | **Windows** |
|------|---------------|-----------|---------------|-------------|
| | `image_loading` | `:loading:` | `image_loading` | `:loading:` |
| 0.18 | base64 | N/A | base64 | N/A |
| 0.19 | base64 | N/A | base64 | N/A |
| 0.20 | base64 | N/A | base64 | N/A |
| 0.21 | direct | direct | error | error |
| 0.22 | direct | direct | error | error |
I think we have two options. (1) support absolute paths on every platform, document as such. (2) make absolute paths return an error on Linux. The first restores backwards compatibility, the second is consistent with what we document.
> [r9880]
A nit-pick here -- this commit leads to URIs like `file://s:/...` on Windows. From Wikipedia's page on file URIs "*`file://path` (i.e. two slashes, without a hostname) is never correct, but is often used.*".
I think we should use [Path.as_uri()](https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.as_uri) instead of `.as_posix()`.
A
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Fri Aug 09, 2024 08:45 AM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-11 07:33:25
|
> I think we should use Path.as_uri() instead of .as_posix().
Done in [r9882].
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Sat Aug 10, 2024 11:04 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-10 08:29:24
|
> I think we have two options.
> (1) support absolute paths on every platform, document as such.
> (2) make absolute paths return an error on Linux.
(3) support URI-references: URI | absolute (POSIX) path | relative (POSIX) path
>The first restores backwards compatibility,
the second is needlessly restrictive and hard to explain.
The third is consistent with what we document.
It is also in line with "root-prefix" only being used if a path starts with "/".
The different handling of Windows system paths is due to the syntax definition of URIs where the "path" part is represented by a POSIX path.
---
> I think we should use Path.as_uri() instead of .as_posix().
I prepared a patch.
BTW: the docs for .as_posix() write:
~~~
>>> p = PureWindowsPath('c:/Windows')
>>> p.as_uri()
'file:///c:/Windows'
~~~
Then `/c:/Windows` may be usable as both, absolute file system path under Windows and "absolute-path reference" [1](https://www.rfc-editor.org/rfc/rfc3986.html#section-4.2) in a URI-reference like:
~~~
.. image:: /E:/DCIM/parrot.jpg
~~~
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Fri Aug 09, 2024 09:55 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Adam T. <aa-...@us...> - 2024-08-10 14:09:33
|
> the second is needlessly restrictive and hard to explain. Suppose a user has a directive ``.. image:: /media/circle.svg``. By default this will load the file under `/media/circle.svg`, as the user might expect if they think they are using filesystem paths, and don't realise the nuances of URI references. At some later date, this user sets `root_prefix` to `/home/user/project/`. The image directive will stop working, as it now looks for a file under `/home/user/project/media/circle.svg`. I think this behaviour is hard to explain and quite surprising. I have a proposed fix at https://github.com/AA-Turner/docutils/pull/18 (https://github.com/AA-Turner/docutils/pull/18.patch). This does the following: * Adds a helper function to convert image URIs to paths * Updates all writers (HTML, LaTeX, and ODF/ODT) that support images to use this helper for consistent behaviour * Correctly convert `file:///` URIs to filesystem paths (using Python 3.13's [Path.from_uri()](https://docs.python.org/3.13/library/pathlib.html#pathlib.Path.from_uri) where possible). * Validate that `file:` URIs are absolute paths * Correctly convert relative references to filesystem paths * Changes the default `root_prefix` to `''` (the empty string), and only applies the `root_prefix` behaviour when it is non-empty (i.e. a user has configured the setting) The second commit in the patch goes further, by rejecting absolute paths when used in relative references. This is optional, but as I argue above I think allowing absolute paths in relative references leads to surprising behaviour, and the resolution is easy if you want to keep using absolute paths (just use `file:///`). I think this would be a straightforward end-state to explain to users: either the argument to `.. image::` (and `.. figure::`) is a URI or it is a relative filesystem path. A --- **[bugs:#493] Test failure on Windows with embedded images** **Status:** open **Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner **Last Updated:** Sat Aug 10, 2024 08:29 AM UTC **Owner:** nobody xref [r9785], [r9853], [r9855] Dear @milde, Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows. I believe the following demonstrates the problem: ```pycon >>> import sys; print(sys.platform) win32 >>> import urllib.parse, urllib.request >>> urllib.request.url2pathname('test/data/circle-broken.svg') 'test\\data\\circle-broken.svg' >>> urllib.parse.unquote('test/data/circle-broken.svg') 'test/data/circle-broken.svg' ``` Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``). I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here? For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path). A --- Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-12-04 15:48:46
|
A helper function that converts a URI-reference to a filesystem path for HTML and LaTeX writers based on this patch is introduced in [r9996]. The ODT writer will have to wait, as this is a backwards-incompatible change: : The ODT writer expects relative image paths to be relative to the *source* directory.
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** pending-works-for-me
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Mon Nov 25, 2024 09:08 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-10 14:48:42
|
Your analysis with "the `uri_parts.scheme not in ('', 'file')` check" beeing "the root cause" of the test failure is correct. This check was introduced in [r9501] "Refactor "_html_base" writer."
However, even before (since introduction of image embedding in 0.18), the drive part was split from the path. To verify this, try `PS> echo ".. image:: S:/Development/docutils/docutils/test/data/circle.svg" | python rst2html5.py --link-stylesheet --image-loading embed` from a directory in a different drive.
BTW: both, the introduction of the :loading: directive option and of direct embedding of SVG images did not change path handling (you should get the same behaviour with a PNG or JPG image).
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Sat Aug 10, 2024 02:09 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-10 23:04:26
|
Another example why we should make it clear that the image argument is an "URI reference": ~~~ .. image:: My Documents/first drawing.png ~~~ is parsed to ~~~ .. image:: My Documents/first drawing.png ~~~ because [spaces are dropped from link blocks and image-URIs](https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#uri-context) to allow easy breaking of long URIs over several lines. This behaviour is surprising when you consider an image attribute without scheme a path. --- **[bugs:#493] Test failure on Windows with embedded images** **Status:** open **Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner **Last Updated:** Sat Aug 10, 2024 02:48 PM UTC **Owner:** nobody xref [r9785], [r9853], [r9855] Dear @milde, Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows. I believe the following demonstrates the problem: ```pycon >>> import sys; print(sys.platform) win32 >>> import urllib.parse, urllib.request >>> urllib.request.url2pathname('test/data/circle-broken.svg') 'test\\data\\circle-broken.svg' >>> urllib.parse.unquote('test/data/circle-broken.svg') 'test/data/circle-broken.svg' ``` Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``). I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here? For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path). A --- Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/ To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-11 22:44:34
|
> The second commit in the patch goes further, by rejecting absolute paths when used in relative references. This is optional, but as I argue above I think allowing absolute paths in relative references leads to surprising behaviour the resolution is easy if you want to keep using absolute paths (just use file:///).
Using file does not work if you want to generate HTML that should be served via https and
* create two output versions with/without embedding images from the same source, or
* read the image size from the file but not embed the image:
~~~
.. image:: /pictures/parrot.jpg
:scale: 100
~~~
If the http server looks for files in the dir "/public/wwwfiles/", say, you can set this dir as "root-prefix".
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Sun Aug 11, 2024 07:33 AM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-15 20:51:43
|
@aa-turner: Could you create a patch that just
> Changes the default `root_prefix` to `''` (the empty string), and only applies the `root_prefix` behaviour when it is non-empty (i.e. a user has configured the setting)
I agree that this is easier to explain (while the current implementation does not need a conditional). As the change is also backwards-compatible we can implement it without advance warning.
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Sun Aug 11, 2024 10:44 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-11-25 21:08:29
|
This part of the patch is applied in [r9991].
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** pending-works-for-me
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Tue Oct 15, 2024 09:44 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-08-30 11:40:03
|
@aa-turner: Do the tests still fail on Windows?
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Thu Aug 15, 2024 08:51 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2024-10-15 21:44:19
|
- **status**: open --> pending-works-for-me
- **Comment**:
@aa-turner: Can we close this bug?
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** pending-works-for-me
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Fri Aug 30, 2024 11:39 AM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2025-04-14 13:10:48
|
- **status**: pending-works-for-me --> open-fixed
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open-fixed
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Wed Dec 04, 2024 03:48 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2025-07-30 09:11:29
|
@aa-turner: Can we close this bug? Is there still something to do after 0.22 is out?
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** open-fixed
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Mon Apr 14, 2025 01:10 PM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |
|
From: Günter M. <mi...@us...> - 2025-09-18 07:56:47
|
- **status**: open-fixed --> closed-fixed
---
**[bugs:#493] Test failure on Windows with embedded images**
**Status:** closed-fixed
**Created:** Wed Aug 07, 2024 02:25 AM UTC by Adam Turner
**Last Updated:** Wed Jul 30, 2025 09:11 AM UTC
**Owner:** nobody
xref [r9785], [r9853], [r9855]
Dear @milde,
Thank you for the fix to my recent patch. It seems neither my patch nor the fix addressed the root cause of the test failures, as tests have resumed failing on Windows.
I believe the following demonstrates the problem:
```pycon
>>> import sys; print(sys.platform)
win32
>>> import urllib.parse, urllib.request
>>> urllib.request.url2pathname('test/data/circle-broken.svg')
'test\\data\\circle-broken.svg'
>>> urllib.parse.unquote('test/data/circle-broken.svg')
'test/data/circle-broken.svg'
```
Currently, we use `imagepath = urllib.request.url2pathname(uri_parts.path)`, which converts path separators to their platform-native format. On UNIX, `url2pathname` simply calls `unquote`, but on Windows it handles UNC paths (``\\host\path\``) and escaped drive letters (``///C|/users/``).
I don't know what led to using `url2pathname()`, as it is quite specialised (the docstring notes "not recommended for general use"). Is it possible to use the simpler `unquote()` here?
For local file paths (e.g. without a ``file:///`` scheme), should we even be using URI parsing? Perhaps we should use proper path handling if there is no URI scheme (i.e. the user has provided a file-path).
A
---
Sent from sourceforge.net because doc...@li... is subscribed to https://sourceforge.net/p/docutils/bugs/
To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/docutils/admin/bugs/options. Or, if this is a mailing list, you can unsubscribe from the mailing list. |