too many diffs when new element is inserted

Lou
2008-11-19
2013-03-03
  • Lou

    Lou - 2008-11-19

    I'm running into a problem when a new tag is inserted into my XML.  Say the control XML looks like this:

    <a>1</a>
    <b>1</b>
    <c>1</c>
    <d>1</d>
    <e>1</e>

    The test XML looks like this:

    <a>1</a>
    <b>1</b>
    <z>1</z>
    <c>1</c>
    <d>1</d>
    <e>1</e>

    When run through the diff engine with "theDiff.overrideElementQualifier(new MultiLevelElementNameAndTextQualifier(2,true));" I get diffs detected for z, c, d, and e (it is comparing the wrong control tag to the test tag).  Is there any way to get the diff engine to recognize that a tag was inserted and not "mark" all subsequent tags as different?  I also tried the "RecursiveElementNameAndTextQualifier", but it reports diffs between totally different tags.  Any thoughts?  Is there another alternative?

     
    • Stefan Bodewig

      Stefan Bodewig - 2008-11-19

      Are these examples real?  If so, you really don't need anything but ElementNameQualifier which is what XMLUnit uses by default.

      Let's see

              String control = "<x>"
                  + "<a>1</a>"
                  + "<b>1</b>"
                  + "<c>1</c>"
                  + "<d>1</d>"
                  + "<e>1</e>"
                  + "</x>";
              String test = "<x>"
                  + "<a>1</a>"
                  + "<b>1</b>"
                  + "<z>1</z>"
                  + "<c>1</c>"
                  + "<d>1</d>"
                  + "<e>1</e>"
                  + "</x>";

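              // DetailedDiff, Diff and Difference live in org.custommonkey.xmlunit;
              // DetailedDiff collects every difference instead of stopping at the first one.
              DetailedDiff diff = new DetailedDiff(new Diff(control, test));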
              for (java.util.Iterator iter = diff.getAllDifferences().iterator();
                   iter.hasNext(); ) {
                  Difference d = (Difference) iter.next();
                  System.err.println("found "
                                     + (d.isRecoverable() ? "" : "non-")
                                     + "recoverable difference:");
                  System.err.println(d);
              }

      produces

      found non-recoverable difference:
      Expected number of child nodes '5' but was '6' - comparing <x...> at /x[1] to <x...> at /x[1]
      found recoverable difference:
      Expected sequence of child nodes '2' but was '3' - comparing <c...> at /x[1]/c[1] to <c...> at /x[1]/c[1]
      found recoverable difference:
      Expected sequence of child nodes '3' but was '4' - comparing <d...> at /x[1]/d[1] to <d...> at /x[1]/d[1]
      found recoverable difference:
      Expected sequence of child nodes '4' but was '5' - comparing <e...> at /x[1]/e[1] to <e...> at /x[1]/e[1]
      found non-recoverable difference:
      Expected presence of child node 'null' but was 'z' - comparing  at null to <z...> at /x[1]/z[1]

      so you get recoverable differences for c, d and e and non-recoverable ones for x and z.  Looks like the expected result to me.
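
      To make the sequence numbers in that output easier to follow, here is a minimal JDK-only sketch of name-based matching. It is a simplified model, not XMLUnit's implementation: the class SequenceSketch and its describe method are hypothetical names, children are represented by their element names only, and names are assumed to be unique among siblings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not part of XMLUnit.
final class SequenceSketch {

    // Pairs children by element name (assumes unique names among siblings)
    // and reports position shifts separately from unmatched children.
    static List<String> describe(List<String> control, List<String> test) {
        List<String> report = new ArrayList<>();
        for (int i = 0; i < control.size(); i++) {
            String name = control.get(i);
            int j = test.indexOf(name);
            if (j < 0) {
                report.add("only in control: " + name);
            } else if (j != i) {
                // Same element, different position: a sequence change.
                report.add("sequence " + i + " -> " + j + ": " + name);
            }
        }
        for (String name : test) {
            if (!control.contains(name)) {
                report.add("only in test: " + name);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> control = Arrays.asList("a", "b", "c", "d", "e");
        List<String> test = Arrays.asList("a", "b", "z", "c", "d", "e");
        for (String line : describe(control, test)) {
            System.out.println(line);
        }
    }
}
```

      For the lists above, the sketch reports position shifts for c, d and e and lists z as present only in the test, which mirrors the split between sequence differences and the unmatched node in the output.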

       
      • Lou

        Lou - 2008-11-20

        Yes, I did present a simplified case, but your example of keeping it simple is proving useful.  The actual XML has many deeply nested elements, and I think that is what requires me to use the recursive qualifier.  I'm trying a lower-tech approach that involves breaking out all of the nested elements into their own docs and then running the diff on each one - I'll post back if this works.

        For reference, here is a more 'complex' example:

        <x>
        <a>aaa</a>
        <b>aaa</b>
        <c>aaa</c>
        <d>
        <da>ddd</da>
        <db>ddd</db>
        <dc>ddd</dc>
        </d>
        <d>
        <da>ddd</da>
        <db>ddd</db>
        <dc>ddd</dc>
        </d>
        <d>
        <da>ddd</da>
        <db>ddd</db>
        <dc>ddd</dc>
        <dd><q>qqq</q><r>rrr</r></dd>
        </d>

        <e>
        <ea>eee</ea>
        </e>

        <f>fff</f>
        <and so on/>
        </x>

        As you can see, there can be some deep nesting (I didn't display all of the levels) and also "lists" of the same element that have to be diff'd by their contents (hence the use of the recursive qualifier).
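
        One way to picture "diff'd by their contents" is the JDK-only sketch below. It is neither the split-into-separate-docs approach mentioned above nor what RecursiveElementNameAndTextQualifier does internally; it is a simpler content-signature comparison. The class RepeatedElementSketch and its methods are hypothetical names, and the sketch assumes that elements containing child elements carry no meaningful text of their own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; this is not part of XMLUnit.
final class RepeatedElementSketch {

    // Builds a "name(content)" signature so that siblings sharing a name
    // can be told apart by their nested content.
    static String signature(Element e) {
        StringBuilder sb = new StringBuilder(e.getTagName()).append('(');
        boolean hasChildElements = false;
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                hasChildElements = true;
                sb.append(signature((Element) n));
            }
        }
        if (!hasChildElements) {
            sb.append(e.getTextContent().trim()); // leaf element: use its text
        }
        return sb.append(')').toString();
    }

    // Sorted signatures of the root's direct children with the given name.
    static List<String> signatures(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> result = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild(); n != null;
                 n = n.getNextSibling()) {
                if (n instanceof Element && ((Element) n).getTagName().equals(name)) {
                    result.add(signature((Element) n));
                }
            }
            Collections.sort(result); // ignore document order
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse XML", ex);
        }
    }

    // True when both documents hold the same <name> children, in any order.
    static boolean sameRepeated(String control, String test, String name) {
        return signatures(control, name).equals(signatures(test, name));
    }

    public static void main(String[] args) {
        String control = "<x><d><da>1</da></d><d><da>2</da></d></x>";
        String test = "<x><d><da>2</da></d><d><da>1</da></d></x>";
        System.out.println(sameRepeated(control, test, "d"));
    }
}
```

        Sorting the signatures means two lists of d elements compare equal whenever they hold the same contents, regardless of order; it does not say which individual element changed when they differ.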

        -Lou

         
    • Virat Gohil

      Virat Gohil - 2009-04-13

      Hi,

      I ran into the same issue. As far as I can tell, DifferenceEngine.compareNodeList() does the following:

      1. It compiles a list of objects that are comparable (that qualify for comparison).
      2. Once this operation is complete, three lists are available:
          a. the list of comparable objects
          b. the list of objects present in the control node but absent from the test node
          c. the list of objects present in the test node but absent from the control node
      3. It compares the comparable objects.
      4. It compares the objects in list b against those in list c (above), even though they do not qualify for comparison.
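
      The four steps above can be modelled with the JDK-only sketch below. It is a simplified reading of this description, not XMLUnit's code: MatchingSketch and pair are hypothetical names, each child is reduced to a "name=text" string, and the qualifier is a plain BiPredicate standing in for an ElementQualifier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical illustration only; this is not part of XMLUnit.
final class MatchingSketch {

    // Returns "control~test" pairs: qualified matches first (steps 1-3),
    // then leftovers paired up regardless of the qualifier (step 4).
    static List<String> pair(List<String> control, List<String> test,
                             BiPredicate<String, String> qualifier) {
        List<String> pairs = new ArrayList<>();
        List<String> unmatchedControl = new ArrayList<>();      // list b
        List<String> unmatchedTest = new ArrayList<>(test);     // becomes list c
        for (String c : control) {
            String match = null;
            for (String t : unmatchedTest) {
                if (qualifier.test(c, t)) {
                    match = t;
                    break;
                }
            }
            if (match != null) {
                unmatchedTest.remove(match);
                pairs.add(c + "~" + match);
            } else {
                unmatchedControl.add(c);
            }
        }
        // Step 4: leftovers are compared even though they never qualified.
        for (String c : unmatchedControl) {
            String t = unmatchedTest.isEmpty() ? null : unmatchedTest.remove(0);
            pairs.add(c + "~" + t + " (forced)");
        }
        for (String t : unmatchedTest) {
            pairs.add("null~" + t + " (unmatched)");
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> control = Arrays.asList("a=1", "y=1", "c=1");
        List<String> test = Arrays.asList("a=1", "z=1", "c=1");
        for (String p : pair(control, test, String::equals)) {
            System.out.println(p);
        }
    }
}
```

      In this model, a leftover y in the control ends up compared against a leftover z in the test, which is the behaviour described in point 4. When the test merely has one extra element, nothing is left over on the control side and the extra element is reported on its own.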

      I think point 4 is a defect. I downloaded the source code and modified the following:

      Original: DifferenceEngine.java:435
                  if (nextTest == null && !unmatchedTestNodes.isEmpty()) {
                      nextTest = (Node) unmatchedTestNodes.get(0);
                      testIndex = new Integer(testChildren.indexOf(nextTest));
                      unmatchedTestNodes.remove(0);
                  }
      Modified:
                  if (nextTest == null && !unmatchedTestNodes.isEmpty()) {
                      nextTest = (Node) unmatchedTestNodes.get(0);
                      // Only pair the leftover test node with the control node if the
                      // qualifier accepts the pair. The casts assume both nodes are Elements.
                      if (elementQualifier.qualifyForComparison((Element) nextControl, (Element) nextTest)) {
                          testIndex = new Integer(testChildren.indexOf(nextTest));
                          unmatchedTestNodes.remove(0);
                      } else {
                          nextTest = null;
                      }
                  }

      HTH,

      Virat

       
