Roundtrip fails due to escaping error on line break
ruamel.yaml is a YAML 1.2 parser/emitter for Python
ruamel.yaml seems to dump certain strings incorrectly when preserve_quotes is True and a line break falls next to backslashes.
from ruamel.yaml import YAML
from ruamel.yaml.compat import StringIO
import sys
doc = """
key: "1234567890\\\\\\\\ a"
"""
print("Data Original:", doc)
yaml = YAML()
yaml.preserve_quotes = True
yaml.width = 10
data_original = yaml.load(doc)
stream = StringIO()
yaml.dump(data_original, stream)
doc_dumped = stream.getvalue()
print("Document Dumped:", doc_dumped)
data_roundtrip = yaml.load(doc_dumped)
print("Data Roundtrip:", data_roundtrip)
assert data_original["key"] == data_roundtrip["key"]
In 0.17.22 and prior, this works fine, with the following output:
python3 test.py
Data Original:
key: "1234567890\\\\ a"
Document Dumped: key: "1234567890\\\
\\ a"
Data Roundtrip: {'key': '1234567890\\\\ a'}
Starting with 0.17.23 (up to the current 0.18.5) we get the following output:
python3 test.py
Data Original:
key: "1234567890\\\\ a"
Document Dumped: key: "1234567890\\
\\ a"
Data Roundtrip: {'key': '1234567890\\ \\ a'}
Traceback (most recent call last):
File "/Users/kantert/src/pipeline-tools/config-generator/test.py", line 23, in <module>
assert data_original["key"] == data_roundtrip["key"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
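The two dumps differ by a single backslash before the line break, and that is enough to change the value that gets loaded back. In a YAML double-quoted scalar, an unescaped line break folds into one space, while a line break preceded by an unescaped backslash contributes nothing. The sketch below is a simplified model of that rule, not ruamel.yaml's implementation: fold_double_quoted is a hypothetical helper that only handles single line breaks and the \\ and \" escapes.

```python
import re


def fold_double_quoted(body: str) -> str:
    """Simplified model of line folding inside a YAML double-quoted scalar.

    Assumptions: `body` is the text between the quotes, contains no empty
    lines, and uses only the escapes \\\\ and \\" (hypothetical helper).
    """
    lines = body.split("\n")
    raw = lines[0]
    for continuation in lines[1:]:
        # Count the backslashes that sit directly before the line break.
        trailing = len(raw) - len(raw.rstrip("\\"))
        if trailing % 2 == 1:
            # Odd count: the last backslash escapes the break, so the break
            # and the continuation's leading whitespace produce nothing.
            raw = raw[:-1] + continuation.lstrip(" \t")
        else:
            # Even count: every backslash is part of a \\ pair, so the break
            # is unescaped and folds into a single space.
            raw = raw.rstrip(" \t") + " " + continuation.lstrip(" \t")
    # Resolve the two escapes this sketch supports.
    return re.sub(r'\\(["\\])', r"\1", raw)


# Scalar body as dumped by 0.17.22: three backslashes before the break.
dumped_ok = "1234567890" + "\\" * 3 + "\n  " + "\\" * 2 + " a"
# Scalar body as dumped by 0.17.23+: two backslashes before the break.
dumped_bad = "1234567890" + "\\" * 2 + "\n  " + "\\" * 2 + " a"

print(repr(fold_double_quoted(dumped_ok)))
print(repr(fold_double_quoted(dumped_bad)))
```

Under these assumptions the first body folds back to the original value, while the second gains a space between the two backslashes, which matches the roundtrip data shown in the failing output.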
Just to expand on this: the problem occurs when a Unicode control character (e.g. x00-x19), or one of a few other characters including the backslash \, falls at the site of a line break and the next line has at least one whitespace character (" "). For instance, the following values would trigger the issue:
This does not:
As far as I can tell, the problem occurs in emitter.py:1436-1447, the code that decides whether an escaped line break should be inserted. I can't say I understand exactly what that code is attempting to accomplish, but it seems to me that at least an escape \ should be used if no space " " immediately follows the line break site.
Last edit: Peter Van Dyken 2024-02-14