#528 ruamel fails to merge more than one pointer to anchor

wont-fix
nobody
None
minor
bug
2025-07-30
2024-09-21
Jeff Saremi
No

I have the following YAML, and I'm using the code below to preserve illegal keys and to flatten the YAML by merging anchors in wherever they are referenced.

When I use the pointers (aliases) individually, the merge works. If I use more than one pointer, the merge fails after the first one.

Code:

import sys
from ruamel.yaml import YAML, CommentedMap, CommentedSeq, SafeConstructor

yaml_data = """
_common_metadata: &common_metadata
  ... # see below
"""
yaml = YAML()
yaml.Constructor.flatten_mapping = SafeConstructor.flatten_mapping
yaml.preserve_quotes = True
yaml.default_flow_style = False
yaml.allow_duplicate_keys = True
yaml.representer.ignore_aliases = lambda x: True 

data = yaml.load(yaml_data)
yaml.dump(data, sys.stdout)

Sample Yaml:

_common_metadata: &common_metadata
  metadata:
    metadata:
      owning_team: "asset-bare-metal-eng"

_api_ingress: &api_ingress

  - port: 9090
    protocol: tcp
    cidr_blocks: !flatten
      - !ref constant::private_cidr_blocks
    self_ingress: true

_resources: &resources
  service:
    api:
      eks_cluster: infra/eks::common:aws_eks_cluster:ecosystem-shared-dev
      eks_healthcheck_ports: [3001, 9090]
      ingress_rules: *api_ingress
      load_balancers:

        - ":::application_load_balancer:api"
    worker:
      eks_cluster: infra/eks::common:aws_eks_cluster:ecosystem-shared-dev
      eks_healthcheck_ports: [3001]
    poller:
      eks_cluster: infra/eks::common:aws_eks_cluster:ecosystem-shared-dev
      eks_healthcheck_ports: [3001]

_dev_dns: &dev_dns
  dns:
    bare-metal-provisioner-dev.cbhq.net:
      records:

        - !ref :::application_load_balancer:api#alb.dns_name

# S3 bucket for bare metal provisioner
_s3_provisioner: &s3_provisioner
  s3:
    provisioner:
      name: !sub "bare-metal-provisioner-{{context:configuration}}"
      object_ownership: "BucketOwnerEnforced"
      backup: false
      block_public_operations: true
      server_side_encryption_aws_managed: true
      policy:
        file: projects/infra/policies/s3-cross-account-bucket-policy.json.erb
        inline: true
        context:
          read_roles:

            - !sub arn:aws:iam::{{constant::account_id}}:role/odin/odin-c3-bare-metal-provisioner-{{context:configuration}}-worker
          write_roles:
            - !sub arn:aws:iam::{{constant::account_id}}:role/odin/odin-c3-bare-metal-provisioner-{{context:configuration}}-worker

# AWS Account
ecosystem-shared-dev:
  development:
    <<: *common_metadata
    <<: *resources
    <<: *dev_dns
    <<: *s3_provisioner

Works:

ecosystem-shared-dev:
  development:
    <<: *common_metadata

Works:

ecosystem-shared-dev:
  development:
    <<: *resources

Does not work:

ecosystem-shared-dev:
  development:
    <<: *common_metadata
    <<: *resources

Does not work:

ecosystem-shared-dev:
  development:
    <<: *resources
    <<: *common_metadata
    <<: *dev_dns
    <<: *s3_provisioner

Discussion

  • Jeff Saremi

    Jeff Saremi - 2024-09-21

    Here's an example of a case that does not work:

    ecosystem-shared-dev:
      development:
        metadata:
          metadata:
            owning_team: "asset-bare-metal-eng"
    
        <<:
          dns:
            bare-metal-provisioner-dev.cbhq.net:
              records:
    
              - !ref :::application_load_balancer:api#alb.dns_name
    
     
  • Anthon van der Neut

    • status: open --> wont-fix
     
  • Anthon van der Neut

    The only thing allow_duplicate_keys does is that it will not complain about additional keys, but instead throws them away.
    I don't know where the misconception comes from that you can preserve duplicate keys (assuming that is what you mean by illegal keys). ruamel.yaml is a YAML parser; what you have is not YAML.

     
  • Gernot Salzer

    Gernot Salzer - 2025-07-29

    As I understand it, the issue seems to be that

    data = yaml.load("""
    ---
    
    - &A
      a: 1
      b: 2
    - &B
      c: 3
      d: 4
    - <<: *A
      <<: *B
    """)
    

    gives an error "duplicate key", apparently because << occurs twice. I understand that ruamel.yaml may not support it, but this is valid YAML, and << is not a normal key, but an operator.

     
    👎
    1
    • Anthon van der Neut

      Please update your claim with a link to official documentation where it says it is an operator.

       
      👍
      1
  • Gernot Salzer

    Gernot Salzer - 2025-07-29

    You are right, I was misled by the fact that several YAML checkers and YAML-to-JSON converters accept multiple << keys as "Valid YAML" and merge the referenced dicts successively. Apparently the correct way to specify it is <<: [*A, *B]

     
    • Anthon van der Neut

      There is another issue with accepting multiple merge keys. Given the way merge keys are defined (insert the keys from the merge key's value dict(s)), and because the keys of a mapping are unordered in YAML, there is no deterministic way to get a result from two merges (the sequence of dicts that can be the value of a merge key is, of course, ordered).

       
  • Gernot Salzer

    Gernot Salzer - 2025-07-29

    Why I actually came here: I'm trying to find out how to construct, in Python, a mapping that contains something like <<: *A or <<: [*A, *B]. So not reading the structure and updating it, but constructing it from scratch. Can you give me a pointer to an example (like a test case) or another reference?

     
    • Anthon van der Neut

      There might be a StackOverflow post on that, I am not sure. But since you can round-trip merge keys in ruamel.yaml without losing them, follow the usual recommendation:
      1) check that the round-trip works, 2) inspect the intermediate result.
      Look at CommentedMap and how it is constructed and filled in constructor.py:RoundTripConstructor.flatten_mapping (see how the local merge_map_list is constructed).
      If the YAML parser can construct it from YAML, you can do the same from just Python. The only caveat is that there is no guarantee that the internals will stay unchanged. So pin the version you use, and test.
      (These kinds of questions are better suited for StackOverflow.)

       
      👍
      1
      • Gernot Salzer

        Gernot Salzer - 2025-07-30

        Thanks a lot for the pointer to StackOverflow. Your answer to "How to change one of the attribute reference to anchor in ruamel.yaml" (stackoverflow.com) from 2020 also seems to answer my question. I still have to try it out, but the given example is exactly my case.

         
