Compact sequences have extra indentation
ruamel.yaml is a YAML 1.2 parser/emitter for Python
It seems that compact sequences are written with extra indentation. I encountered this when using yamllint (https://github.com/adrienverge/yamllint) on files processed by ruamel.yaml.
I was able to reproduce the problem using a simple unit test (I used one from test_spec_examples.py):
def test_example_2_3_3():
    yaml = YAML()
    yaml.indent(sequence=4, offset=2)
    yaml.round_trip("""
    american:
      - - Boston Red Sox
        - Detroit Tigers
        - New York Yankees
      - New York Mets
      - Chicago Cubs
      - Atlanta Braves
    """)
It failed for me with the following message:
--- input string
+++ round trip YAML
@@ -1,7 +1,7 @@
 american:
-  - - Boston Red Sox
-    - Detroit Tigers
-    - New York Yankees
+  -   - Boston Red Sox
+      - Detroit Tigers
+      - New York Yankees
   - New York Mets
   - Chicago Cubs
   - Atlanta Braves
Looking at the YAML spec (https://yaml.org/spec/1.2/spec.html#id2797382), it looks like the extra indentation is not necessary, but I'm not sure.
There is no such thing specified in YAML 1.2 as "compact" sequences. There is block style and flow style and this is a block style within a block style sequence.
Please note that in ruamel.yaml all sequences are indented consistently, nested or not, as is documented. Indentation is never preserved; it is made consistent.
What you show as input is an inconsistent mixture of indentation, and that is of course not supported. I see no extra indentation: there is an indent of 4 for the first level of sequence and 4 for the second level, so Boston Red Sox starts at column 8 (counting from 0), exactly as you specified.
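The column arithmetic behind that statement can be sketched in a few lines. This is only a simplified model, not ruamel.yaml code: the helper name `sequence_columns` is hypothetical, and it assumes the outermost sequence hangs off a parent that starts at column 0 and that only sequences are nested.

```python
def sequence_columns(level, sequence=2, offset=0):
    """Return (dash_column, content_column) for a block sequence nested `level` deep.

    Simplified model (an assumption, not taken from ruamel.yaml itself):
    each nesting level adds `sequence` columns, and the dash sits `offset`
    columns into that level's indent.
    """
    if level < 1:
        raise ValueError("level must be >= 1")
    dash_column = (level - 1) * sequence + offset
    content_column = level * sequence
    return dash_column, content_column


# Settings from the failing test: yaml.indent(sequence=4, offset=2)
for level in (1, 2):
    dash, content = sequence_columns(level, sequence=4, offset=2)
    print(f"level {level}: dash at column {dash}, content at column {content}")
```

Under this model the second-level content lands at column 8 for `sequence=4, offset=2`, and at column 4 for `sequence=2, offset=0`.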
Thanks for taking a look.
Quote from the spec:
This means that the first "-" with the following space in the line with "Boston Red Sox" should be considered part of the block indentation. This is also illustrated by example 8.15, which indicates the whole block:
Last edit: Sergey Tyurin 2020-04-09
So by "compact sequence" you seem to mean the compact notation of a block sequence. Somehow I cannot find your quote in the spec, though.
However, the output of 8.15 is consistent (as ruamel.yaml defines it). The root-level sequence has an indent of 2 with an offset of 0 for the sequence indicator, and the nested block sequence likewise. You can get that in ruamel.yaml with
yaml.indent(sequence=2, offset=0)
(i.e. the default). Please note that that example is not incorrect YAML; it can be parsed, but so can
abc:
- x
- y
def:
  - k
  - l
And that is essentially what you do: some block sequences are indented further than others.
Here's the part of the spec with the text I quoted; it's in "8.2.1. Block Sequences" (https://yaml.org/spec/1.2/spec.html#id2797382):
As you can see in this screenshot, the example with "Compact sequence" illustrates this behavior, where the block includes the first hyphen:
So the block starts at column 0. This is indicated by the color: it starts from column 0 on the second line. On the first line, the first hyphen is not part of the block but part of the indentation.
For example, for an indentation of 4 spaces it would be: