What's happening here is that when <error0> is encountered, Saxon no longer knows what state the FSA for the content model of <shiporder> is supposed to be in, so it doesn't attempt to locate schema declarations for the following siblings of error0, and in fact it doesn't report any errors for following siblings of error0 or their subtrees. So in that sense, the errors are related.

In fact the "element declarations consistent" constraint means that it's possible to know the governing type of shipto even though we've departed from the FSA for its parent element, so some smarter error recovery might be possible here. However, error recovery always creates the danger of spurious errors.
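To make the behaviour concrete, here is a rough Python sketch. It is a toy model under stated assumptions, not Saxon's actual implementation: the content models are simplified to ordered lists of child names, required-element checks are ignored, and the names `CONTENT_MODELS`, `validate`, `build_doc` and the `recover` flag are invented for illustration. With `recover=False` the validator gives up on the following siblings once an unexpected element is seen; with `recover=True` it skips the offending element and continues, looking up siblings by name.

```python
# Toy model only: illustrates the behaviour described above, not Saxon's code.
# Each complex element maps to the ordered child names its content model allows;
# elements absent from this table are treated as simple (text-only) leaves.
CONTENT_MODELS = {
    "shiporder": ["orderperson", "shipto", "item"],
    "shipto": ["name", "address", "city", "country"],
    "item": ["title", "note", "quantity", "price"],
}


def validate(node, path="", errors=None, recover=False):
    """Collect error messages for a (name, content) tree.

    content is either a string (text) or a list of child nodes.
    """
    errors = [] if errors is None else errors
    name, content = node
    path = f"{path}/{name}"
    model = CONTENT_MODELS.get(name)
    if model is None:
        return errors  # simple leaf: nothing to check in this toy
    if isinstance(content, str):
        errors.append(f"{path}: character content not allowed")
        return errors
    state = 0  # position reached in the (simplified) content model
    for child in content:
        child_name = child[0]
        if child_name in model[state:]:
            state = model.index(child_name, state)
            validate(child, path, errors, recover)
        else:
            errors.append(f"{path}/{child_name}: element not allowed here")
            if not recover:
                break  # state is lost: following siblings are not checked
    return errors


def build_doc(skip=()):
    """Model of the test.xml quoted below, optionally omitting some elements."""
    def leaf(n):
        return (n, "text")

    shipto_names = ("error1", "error2", "name", "address", "city", "country")
    shipto = ("shipto", [leaf(n) for n in shipto_names if n not in skip])
    item = ("item", [leaf(n) for n in ("title", "note", "quantity", "price")])
    children = [leaf("orderperson"), leaf("error0"), shipto,
                leaf("error3"), item, ("item", "error4")]
    return ("shiporder", [c for c in children if c[0] not in skip])
```

In this toy, `validate(build_doc())` reports only error0, and `validate(build_doc(skip=("error0",)))` reports error1 and error3, which matches the observations quoted below. `validate(build_doc(), recover=True)` reports all five. The toy does not show the downside of recovery: in a real validator, continuing after an error can also produce spurious follow-on messages.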

But I'll see what I can do to improve it.

Michael Kay
Saxonica

On 22/03/2012 13:35, John Janssen wrote:
Thanks for the feedback so far. That clears up some confusion indeed. I’ll post a test case that is representative of our real cases.
 
My test.xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923">
  <orderperson>John Smith</orderperson>
  <error0>error0</error0>
  <shipto>
    <error1>error1</error1>
    <error2>error2</error2>
    <name>Ola Nordmann</name>
    <address>Verweg 12</address>
    <city>Oslo</city>
    <country>Norway</country>
  </shipto>
  <error3>error3</error3>
  <item>
    <title>title</title>
    <note>note</note>
    <quantity>1</quantity>
    <price>10.00</price>
  </item>
  <item>error4</item>
</shiporder>
 
As you can see, I introduced some errors (five in total, named error0 – error4) at various depths within the XML tree. I will refrain from posting the whole XSD, but it’s a pretty simple one, with shiporder allowing 1 orderperson, 1 shipto and multiple item elements, each of those having the internal elements shown above (aside from the errors, of course). Without the errors this validates fine.
 
When I run this through the validator (command line) I get only the first error reported:
 
>Validation error at /shiporder[1]/error0[1] on line 4 column 11 of test.xml:
>  In content of element <shiporder>: The content model does not allow element <error0> to
>  appear here.
 
The other errors all go unreported, although I would say that they are unrelated to each other (aside from being at the same depth of the XML tree in some cases). Error4 is even of a completely different type (disallowed content, rather than a disallowed element), and is apparently filtered out as well.
 
Expanding on this test case a bit:
- When I remove error0: error1 and error3 are reported.
- When I remove error0 and error1: error2 and error3 are reported.
- When I remove error0, error1 and error3: error2 and error4 are reported.
 
Is there any way that I could get all 5 errors reported at once?
 
Regards,
John
 
 
Sent: Wednesday, March 21, 2012 10:17 AM
Subject: Re: [saxon] Can I produce a full list of validation problems with Saxon Validate?
 

When you run validation from the command line then the validator will in general continue after an error, and will report all the errors that it finds. There's no need to set any special options. (The message you cited was using the validation API, which is different).

However, the validator does try to avoid reporting spurious errors. For example if you misspell an element name, it tries to avoid giving you two errors, one for the presence of the unknown element and one for the absence of the expected element. Similarly, if it can't find a schema declaration for a particular element, it generally won't give you any errors that relate to the subtree below that element. This means that if it can't find a top-level declaration for the root element, you will typically only get one error.

If there are cases where you think multiple errors should be reported but aren't, then I would be interested to see the test case - it's only by looking at real error cases arising in the wild that one can fine-tune the diagnostics.

Michael Kay
Saxonica


On 20/03/2012 12:59, John Janssen wrote:
Hello,
 
Apologies if this has already been asked, but the markmail searchable archives seem to be unreachable.
 
During our XML processing we plan to validate user input XML against an XSD, using the following command line call:
 
/usr/bin/java -cp /tools/saxon/saxon9ee.jar com.saxonica.Validate -s:myinput.xml -xsd:myschema.xsd
 
Unfortunately during testing this only returned the first error, and never a complete list of problems, which we would very much prefer.
 
which seems to relate to the same problem, only in a C#.NET context. Delving through the documentation, I found that the solution suggested there should translate to the command line option “--validation-warnings”, but adding this to my command line call does not seem to produce a more complete list of validation errors either.
 
So does this not work for command line, is my research wrong, or is a complete list of errors as output just not possible?
Maybe someone could suggest another solution in that case?
 
Regards,
John



_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
saxon-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/saxon-help