From: Tatu S. <cow...@ya...> - 2006-10-03 04:06:30
--- Mark Swanson <ma...@Sc...> wrote:
...
> Well, allow me try to make a stronger case:
>
> In the real world, data isn't perfect. One can either toss back illegal
> data or try your best to work with it. A common best practice is to be
> as friendly and as considerate as you can to the incoming data, and
> produce the most accurate and conforming (to whatever standard) outgoing
> data.

This is common practice for some applications ("be conservative in what
you send, liberal in what you accept"), but notably not for XML
processing.

> Currently, VTD fails here wrt the incoming data; there is no way I can
> tell VTD to be considerate and lenient towards the failings of the
> incoming data stream. As a developer, I believe it is my choice - not
> VTD's - whether or not I wish to be considerate and lenient to the
> incoming data. If I wish to consider a form feed

You could argue this, but it is worth noting that no conformant XML
parser accepts characters that are _illegal_ in XML content: try the
same content with, say, Xerces, and you will see what I mean. This is
because the XML specification is very clear not only about what is legal
in a well-formed document, but also about how conforming processors
(parsers) must deal with content that is not. Specifically, they are not
allowed to recover from fatal errors, and must report them to the
application.

Having said that, I would think that if specific lenient modes could be
enabled explicitly (and were disabled by default), that might be
reasonable.

...
> 4. (at least some of) VTDs competitors already scrub the data by
> default. The XPP (Xml Pull Parser) already does this. In fact, I was in
> the middle of switching away from XPP when I ran into this VTD
> limitation. For my particular use case, using VTD is now slower than XPP
> because of this scrubbing issue.

Really? I wouldn't have thought XPP would do that, since I thought it
aims to be an actual conformant XML parser... What kind of scrubbing
does it do?

> A single if{} could allow the pedantic behaviour (as it is currently) or
> a more friendly and considerate (I would argue more industry standard)
> behaviour.

Which industries rely on broken XML content being processed? (An honest
question, no sarcasm intended.)

-+ Tatu +-
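
For reference, here is a minimal sketch of what an opt-in "scrubbing" step in front of a parser could look like. It checks each code point against the Char production of XML 1.0 (under which a form feed, U+000C, is not legal) and either rejects the input or drops the offending character. The class and method names are hypothetical; this is not VTD's or XPP's API, and it makes no claim about what XPP actually does internally.

```java
// Hypothetical pre-parse scrubber; not part of VTD-XML, XPP, or Xerces.
final class XmlCharScrubber {

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // lenient == false: strict behaviour, fail on the first illegal character.
    // lenient == true: silently drop illegal characters (the opt-in mode).
    static String scrub(String input, boolean lenient) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isLegalXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else if (!lenient) {
                throw new IllegalArgumentException(
                    "Illegal XML 1.0 character U+"
                        + Integer.toHexString(cp).toUpperCase()
                        + " at index " + i);
            }
            // Advance by one or two UTF-16 units, depending on the code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

With this sketch, `XmlCharScrubber.scrub("a\fb", true)` drops the form feed, while the same call with `false` throws. Doing the scrubbing as a separate step keeps the parser itself strict, at the cost of an extra pass over the data.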