Compare XML nodes by XPath

2019-06-04
2020-04-11
  • David Alexis Vargas Álvarez

    Hello, it's me again.

    I'm using the C# version of the framework.

    I've come with another question. I know there's a WithNodeFilter in the DiffBuilder class that lets you define whether or not a node takes part in the comparison. But XmlNode doesn't provide any property or method to get the node's XPath. Basically I need a special node filter that selects the nodes to compare by XPath. There could be more than one XPath in these cases. I've created a workaround by selecting the nodes by XPath and then using the DiffBuilder to run a comparison for every pair found.

    I believe there could be a better way to do this using your framework. Can you help me please?

     
    • Stefan Bodewig - 2019-06-08

      Let me first try to illustrate what I think you are trying to do.

      You've got two documents like these:

      <foo>
        <bar>
          <baz>xyzzy</baz>
        </bar>
        <quux id="fred"/>
      </foo>
      
      <foo>
        <bar some-attr="I don't care about">
          <baz>xyzzy</baz>
        </bar>
        <quux id="carl"/>
      </foo>
      

      and you are not interested in differences in foo or bar, only in the pairs of nodes selected by /foo/bar/baz and /foo/quux. Is this correct?

      One way to do it, and I think this is what you describe as your current attempt, is to use XMLUnit's XPath support, select the pairs of nodes and run fresh comparisons for each pair.

      The other approach you seem to describe uses just one comparison and filters out everything you don't want. This is way more complex, even if XmlNode provided a way to get hold of the current node's XPath[1]. In the example above your filter would not only have to return true for the baz and quux nodes but also for the foo node (else the comparison would stop at the root and be done) and the bar node (or you'd never reach baz). Of course, as you are still not interested in any differences of the foos and bars of the world, you'd also have to use a DifferenceEvaluator that knows the XPaths and suppresses all differences for nodes not matching what you are interested in.

      Which leads me to the third option. Don't use a NodeFilter at all but only use a DifferenceEvaluator that downgrades all differences to EQUAL unless the XPath of the comparison detail matches what you are looking for.
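
      A minimal sketch of this third option, under a few assumptions: XMLUnit reports the XPath of each comparison detail with positional predicates (for example /foo[1]/bar[1]/baz[1]), and the XPaths of interest are plain absolute element paths without predicates or wildcards. The class XPathScopedEvaluator and its IsInScope helper are hypothetical names made up for illustration; they are not part of XMLUnit.

      ```csharp
      using System;
      using System.Linq;
      using System.Text.RegularExpressions;
      using Org.XmlUnit.Diff;

      // Hypothetical helper, not part of XMLUnit.
      public static class XPathScopedEvaluator
      {
          // Matches positional predicates such as "[1]" in "/foo[1]/bar[1]".
          private static readonly Regex Position = new Regex(@"\[\d+\]");

          // True if xpath equals one of the wanted paths or lies below one of them.
          // Assumes wanted paths look like "/foo/bar/baz" (no predicates, no wildcards).
          public static bool IsInScope(string xpath, string[] wantedXPaths)
          {
              if (string.IsNullOrEmpty(xpath))
              {
                  // One side of a comparison may have no node, hence no XPath.
                  return false;
              }

              string plain = Position.Replace(xpath, string.Empty);
              return wantedXPaths.Any(w =>
                  plain == w || plain.StartsWith(w + "/", StringComparison.Ordinal));
          }

          // Builds an evaluator that keeps outcomes inside the wanted XPaths
          // and downgrades every other difference to EQUAL.
          public static DifferenceEvaluator For(params string[] wantedXPaths)
          {
              return (comparison, outcome) =>
              {
                  if (outcome == ComparisonResult.EQUAL)
                  {
                      return outcome;
                  }

                  bool relevant =
                      IsInScope(comparison.ControlDetails.XPath, wantedXPaths) ||
                      IsInScope(comparison.TestDetails.XPath, wantedXPaths);
                  return relevant ? outcome : ComparisonResult.EQUAL;
              };
          }
      }
      ```

      For the two documents above, XPathScopedEvaluator.For("/foo/bar/baz", "/foo/quux") would be passed to WithDifferenceEvaluator, probably chained after the default evaluator via DifferenceEvaluators.Chain(DifferenceEvaluators.Default, ...). The extra attribute on bar should then be downgraded to EQUAL, while the differing id on quux should still be reported.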

      I'd recommend either sticking with the approach you currently use or using only a DifferenceEvaluator. Which one I'd pick depends on how big the documents are and how many XPath pairs there are. The DifferenceEvaluator approach will spend time and memory comparing nodes you are not interested in. The "separate comparisons" approach is more targeted but may be costly because of the XPath evaluations.

      With big documents and only a few XPaths I'd pick "separate comparisons"; with smallish documents and many XPaths I'd use DifferenceEvaluator.

      With small documents and only a few XPaths it probably doesn't matter, and good luck with big documents and many XPaths (actually, I'd recommend you try which approach works better for you in that case ;-).

      [1] The lack of such a property has forced me to invent an approach that tracks the current XPath separately as XMLUnit traverses the document. The DOM API is quite a bit older than XPath, which could explain why neither .NET's XmlNode nor Java's Node try to bring them together.

       
  • David Alexis Vargas Álvarez

    Thanks for all the help provided!

    I finally decided to use the "separate comparisons" approach. Here's a short example of what I've done; hopefully this helps someone in the future:

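    // Namespaces this fragment relies on (declare them at the top of the file):
    // using System.Collections.Generic;
    // using System.Xml.Linq;
    // using System.Xml.XPath;
    // using Org.XmlUnit.Builder;
    // using Org.XmlUnit.Diff;
    private static readonly ComparisonType[] ExcludedComparisons =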
        new ComparisonType[] { ComparisonType.CHILD_NODELIST_LENGTH };
    
    private static readonly DifferenceEvaluator DiffEvaluator =
        DifferenceEvaluators.DowngradeDifferencesToEqual(ExcludedComparisons);
    
    public bool CompareNodesByXPath(params string[] nodeXPaths)
    {
        // Initialized so the method compiles and returns true when no XPaths are given.
        bool result = true;
        var controlXmlDocument = XDocument.Load(this.ControlFilePath);
        var testXmlDocument = XDocument.Load(this.testFilePath);
        foreach (var nodeXpath in nodeXPaths)
        {
            result = this.CompareNodes(
                controlXmlDocument.XPathSelectElements(nodeXpath),
                testXmlDocument.XPathSelectElements(nodeXpath));
            if (!result)
            {
                break;
            }
        }
    
        return result;
    }
    
    private bool CompareNodes(
        IEnumerable<XElement> controlNodeCollection,
        IEnumerable<XElement> testNodeCollection)
    {
        bool controlMoveNextState;
        bool testMoveNextState;
        Diff diff;
        using (var controlXmlEnumerator = controlNodeCollection.GetEnumerator())
        using (var testXmlEnumerator = testNodeCollection.GetEnumerator())
        {
            do
            {
                controlMoveNextState = controlXmlEnumerator.MoveNext();
                testMoveNextState = testXmlEnumerator.MoveNext();
                // Stop as soon as at least one enumerator is exhausted.
                if (!(controlMoveNextState && testMoveNextState))
                {
                    if (controlMoveNextState || testMoveNextState)
                    {
                        // One of the enumerators still has items left and the other doesn't.
                        return false;
                    }
                    else
                    {
                        // Both enumerators have no items left, making them equal.
                        return true;
                    }
                }
    
                // Compare the current nodes.
                diff = DiffBuilder
                    .Compare(Input.FromNode(controlXmlEnumerator.Current))
                    .WithTest(Input.FromNode(testXmlEnumerator.Current))
                    .WithDifferenceEvaluator(DiffEvaluator)
                    .Build();
                if (diff.HasDifferences())
                {
                    // Difference found.
                    return false;
                }
            }
            while (true);
        }
    }
    

    As you can see, the example is pretty simple: it just determines whether two XML documents have the same values at the specified XPaths.
    @bodewig if you have any feedback regarding how I'm using the DiffBuilder please let me know.
    Again, thank you for your help.

     
