[Sax-devel] Reporting of well-formed content in malformed documents
From: Elliotte H. <el...@me...> - 2003-10-07 03:22:31
Consider a document such as the following:

    <root>
      <child1/>
      <child2>
    </root>

Clearly it is malformed because the </child2> end-tag is missing. However, a streaming parser using SAX will still report startDocument(), startElement(root), characters(), startElement(child1), endElement(child1), characters(), and startElement(child2) before the malformedness is detected and a SAXParseException is thrown.

Or will it? In my tests with Xerces-J 2.5 I'm getting only startDocument() before a SAXParseException is thrown. The XML spec does not require a parser to throw away content found before the first well-formedness error. However, Xerces seems to be throwing it away for me, and I can't find anything in the SAX spec to say this is wrong.

Not having guaranteed access to the well-formed initial section of the document really decreases the usefulness of a streaming API. For my app, I would like to guarantee that all content before the first well-formedness error is reported via the normal mechanisms. Is this possible? Is this a good idea? Should SAX be rewritten to require this behavior? Or am I out to sea?

Thoughts?

--
Elliotte Rusty Harold
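
For anyone who wants to reproduce this kind of test, here is a minimal sketch in Java. It assumes whatever parser JAXP returns by default, not Xerces-J 2.5 specifically, and the class name SaxPrefixDemo and the helper collectEvents are made up for illustration. It records each callback until parse() throws, so the output shows how much of the well-formed prefix a given parser actually reports.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical test harness; not part of SAX or Xerces.
class SaxPrefixDemo {

    // Parses the given text and returns the names of the SAX callbacks
    // received, followed by "SAXParseException" if the parse failed.
    static List<String> collectEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("startElement(" + qName + ")");
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement(" + qName + ")");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("characters");
            }
        };

        try {
            // DefaultHandler.fatalError() rethrows, so a well-formedness
            // error surfaces here as a SAXParseException.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXParseException e) {
            events.add("SAXParseException");
        }
        return events;
    }

    public static void main(String[] args) throws Exception {
        String malformed = "<root>\n  <child1/>\n  <child2>\n</root>";
        for (String event : collectEvents(malformed)) {
            System.out.println(event);
        }
    }
}
```

Which events appear between startDocument and the exception depends on the parser in use, which is exactly the behavior in question.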