
#346 Compact sequences have extra indentation

invalid
nobody
None
minor
bug
2020-04-09
2020-04-09
No

It seems that compact sequences are written with extra indentation. I encountered this when using yamllint (https://github.com/adrienverge/yamllint) on files processed by ruamel.yaml.

I was able to reproduce the problem using a simple unit test (I used one from test_spec_examples.py):

def test_example_2_3_3():
    yaml = YAML()
    yaml.indent(sequence=4, offset=2)
    yaml.round_trip("""
    american:
      - - Boston Red Sox
        - Detroit Tigers
        - New York Yankees
      - New York Mets
      - Chicago Cubs
      - Atlanta Braves
    """)

It failed for me with the following message:

--- input string
+++ round trip YAML
@@ -1,7 +1,7 @@
 american:
-  - - Boston Red Sox
-    - Detroit Tigers
-    - New York Yankees
+  -   - Boston Red Sox
+      - Detroit Tigers
+      - New York Yankees
   - New York Mets
   - Chicago Cubs
   - Atlanta Braves

Looking at the YAML spec (https://yaml.org/spec/1.2/spec.html#id2797382), it looks like the extra indentation is not necessary, but I'm not sure.
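To make the difference in the diff above concrete, here is a minimal sketch that reports the columns of the leading "-" indicators and of the scalar on a single line. The helper name `indicator_columns` is hypothetical; it uses only the standard library and is not part of ruamel.yaml or yamllint.

```python
def indicator_columns(line):
    """Return (dash_columns, content_column), all 0-based.

    Only the leading run of spaces and "- " sequence indicators is scanned;
    scanning stops at the first character of the scalar.
    """
    dash_cols = []
    i = 0
    while i < len(line):
        if line[i] == " ":
            i += 1
        elif line[i] == "-" and line[i + 1 : i + 2] in (" ", ""):
            # A "-" followed by a space (or end of line) is a sequence indicator.
            dash_cols.append(i)
            i += 1
        else:
            break
    return dash_cols, i


# Line from the input string in the diff.
print(indicator_columns("  - - Boston Red Sox"))    # ([2, 4], 6)
# Same line after the round trip.
print(indicator_columns("  -   - Boston Red Sox"))  # ([2, 6], 8)
```

The outer "-" stays at column 2 in both cases; only the nested "-" moves, from column 4 to column 6.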

Discussion

  • Anthon van der Neut

    • status: open --> invalid
     
  • Anthon van der Neut

    There is no such thing specified in YAML 1.2 as "compact" sequences. There is block style and flow style and this is a block style within a block style sequence.
    Please note that in ruamel.yaml all sequences will be consistently indented, nested or not, as is documented. Indentation is never preserved, it is made consistent.

    What you show as input is an inconsistent mixture of indentation, and that is of course not supported. I see no extra indentation: there is an indentation of 4 for the first level of sequence and 4 for the second level, so Boston Red Sox starts at column 8 (counting from 0), exactly as you specified.
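The column arithmetic in this comment can be sketched as follows. This is a simplified model of what is described here, not ruamel.yaml's emitter: it assumes each nesting level moves the content column right by `sequence`, and places the "-" `offset` columns into that level. The function name `expected_columns` is hypothetical.

```python
def expected_columns(sequence, offset, depth):
    """Return (dash_columns, content_column) for `depth` nested block sequences.

    Assumed model: every level adds `sequence` columns of indentation, and the
    "-" of a level sits `offset` columns after the start of that level.
    """
    dash_cols = []
    content = 0
    for _ in range(depth):
        dash_cols.append(content + offset)
        content += sequence
    return dash_cols, content


# Settings from the report: yaml.indent(sequence=4, offset=2), two levels.
print(expected_columns(4, 2, 2))  # ([2, 6], 8)
# Settings sequence=2, offset=0, two levels.
print(expected_columns(2, 0, 2))  # ([0, 2], 4)
```

Under this model the first result matches the round-tripped line, where Boston Red Sox starts at column 8. The input line in the report has its indicators at columns 2 and 4 and its scalar at column 6, which no single (sequence, offset) pair produces in this model.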

     
  • Sergey Tyurin

    Sergey Tyurin - 2020-04-09

    Thanks for taking a look.

    Quote from the spec:

    The compact notation may be used when the entry is itself a nested block collection. In this case, both the “-” indicator and the following spaces are considered to be part of the indentation of the nested collection.

    This means that the first "-", together with the following space, in the line with "Boston Red Sox" should be considered part of the block indentation. This is also illustrated by example 8.15, which marks the whole block:

    -·- one # Compact
    ··- two # sequence
    
     

    Last edit: Sergey Tyurin 2020-04-09
  • Anthon van der Neut

    So with compact sequence you seem to mean compact notation of a block sequence. Somehow I cannot find your quote in the spec though.

    However, the output of 8.15 is consistent (as ruamel.yaml defines it). The root level sequence has an indent of 2 with an offset for the sequence indicator of 0, and the nested block sequence idem. You can get that in ruamel.yaml with yaml.indent(sequence=2, offset=0) (i.e. the default).
    Please note that example is not incorrect YAML, it can be parsed, but so can

    abc:
    - x
    - y
    def:
      - k
      - l

    And that is essentially what you do: some block sequences are indented further than others.

     
  • Sergey Tyurin

    Sergey Tyurin - 2020-04-09

    Here's the part of the spec with text I quoted, it's in "8.2.1. Block Sequences" (https://yaml.org/spec/1.2/spec.html#id2797382):

    As you can see in this screenshot, the example with "Compact sequence" illustrates this behavior, where the block includes the first hyphen:

    -·- one # Compact
    ··- two # sequence
    

    So the block starts at column 0. This is indicated by the color, which starts from column 0 on the second line. On the first line, the first hyphen is not part of the block, but part of the indentation.

    For example, for an indentation of 4 whitespaces it would be:

    -   - one
        - two
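The reading of the spec given in this comment, where the parent "-" and the spaces after it count as indentation of the nested collection, can be sketched as below. This is an illustration only, not how ruamel.yaml emits sequences; the function name `render_compact` and the `indent` parameter are hypothetical.

```python
def render_compact(items, indent=2):
    """Render a nested list as block-sequence lines in compact notation.

    For a nested list, the parent "-" is padded to `indent` columns and the
    nested sequence starts on the same line; its remaining lines are indented
    by `indent` spaces so that all nested "-" indicators line up.
    """
    lines = []
    for item in items:
        if isinstance(item, list) and item:
            nested = render_compact(item, indent)
            lines.append("-".ljust(indent) + nested[0])
            lines.extend(" " * indent + rest for rest in nested[1:])
        else:
            # Scalars (and empty lists, shown as "[]") follow "- " directly.
            lines.append("- " + str(item))
    return lines


print("\n".join(render_compact([["one", "two"]], indent=2)))
print("\n".join(render_compact([["one", "two"]], indent=4)))
```

With `indent=2` this gives the two lines of example 8.15, and with `indent=4` it gives the two lines shown just above.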
    
     
