Thread: [Sax-devel] undeclared attribute types
From: Elliotte R. H. <el...@me...> - 2002-04-19 12:11:15
Consider the following simple XML document:

<Greeting importance="1">
Hello!
</Greeting>

Notice that this document has no DTD. What is the type of the importance attribute?

I've been saying that in cases like this, importance has type CDATA by default. However, Laurent Bihanic pointed out to me that the only place in the XML spec that indicates this really only refers to attribute value normalization, not to anything else. This is in section 3.3.3 which states:

  All attributes for which no declaration has been read should be treated by a non-validating processor as if declared CDATA.

This does *not* say that such attributes do have type CDATA, and I can't find anything else in the spec that says they do. (The restriction to non-validating processors here is IMO an erratum in the spec which I just reported to the editors.) I think we should delete the UNDECLARED_ATTRIBUTE field and use CDATA as the default attribute type.

The SAX JavaDoc does say this. See http://www.saxproject.org/apidoc/org/xml/sax/Attributes.html#getType(int) which states:

  The attribute type is one of the strings "CDATA", "ID", "IDREF", "IDREFS", "NMTOKEN", "NMTOKENS", "ENTITY", "ENTITIES", or "NOTATION" (always in upper case).

  If the parser has not read a declaration for the attribute, or if the parser does not report attribute types, then it must return the value "CDATA" as stated in the XML 1.0 Recommendation (clause 3.3.3, "Attribute-Value Normalization").

However, it's not at all clear that SAX is correct here. I think SAX may need to change in order to distinguish between genuine CDATA attributes and undeclared or untyped attributes. Thoughts?

--
Elliotte Rusty Harold | el...@me... | Writer/Programmer
The XML Bible, 2nd Edition (Hungry Minds, 2001)
http://www.cafeconleche.org/books/bible2/
http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/
Read Cafe au Lait for Java News: http://www.cafeaulait.org/
Read Cafe con Leche for XML News: http://www.cafeconleche.org/

From: David B. <da...@pa...> - 2002-04-21 20:57:41
OK, I'll add isDeclared() methods in Attributes2: three calls, just like isSpecified(). Clearly, for any attribute A:

    if (!A.isSpecified() || !"CDATA".equals(A.getType()))
        assert (A.isDeclared());

The Attributes2Impl copy constructor can use that for some smarter behavior.

I added notes about the {uri, localName} accessors there ... DTDs don't "understand" namespaces, so declarations and defaults apply to the qName, and the "uri" might not have come from the DTD.

- Dave

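A minimal sketch of how an application might consult isDeclared(), assuming a parser whose Attributes objects also implement org.xml.sax.ext.Attributes2 (not every parser does, hence the instanceof check). The class and method names (UndeclaredAttributeDemo, describeAttributes) are made up for illustration and are not part of SAX.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: reports, for each attribute, the type SAX returns
// and whether the parser says a declaration was read for it.
class UndeclaredAttributeDemo {

    // Returns a map of attribute qName -> "TYPE/declared", "TYPE/undeclared",
    // or "TYPE/unknown" when the parser does not expose Attributes2.
    static Map<String, String> describeAttributes(String xml) throws Exception {
        Map<String, String> result = new LinkedHashMap<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    String declared = "unknown";
                    if (atts instanceof Attributes2) {
                        declared = ((Attributes2) atts).isDeclared(i) ? "declared" : "undeclared";
                    }
                    result.put(atts.getQName(i), atts.getType(i) + "/" + declared);
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return result;
    }

    public static void main(String[] args) throws Exception {
        // The document from the start of this thread: no DTD at all.
        System.out.println(describeAttributes("<Greeting importance=\"1\"> Hello! </Greeting>"));
    }
}
```

With this shape, getType() keeps returning "CDATA" for backward compatibility, and only callers that care about the distinction (JDOM was the case raised in this thread) need to look at the extra flag.
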
From: David B. <da...@pa...> - 2002-04-22 01:45:10
> > The big effort is really in updating parsers to use those codes ... :) Clearly can't happen in the absence of at least a first whack at those codes ... and testing will be interesting too.
>
> This is my common suggestion (maybe I should add an RFE huh?) but I imagine we will either need to come up with a set of tests that indicate which error should be thrown or modify current ones.

I'd expect to see the xmlconf SAX/XML harness get updated. The rules being tested are supposed to be called out there, and that harness is already set up to compare things and generate a report.

Of course, with that large a test database, and no automated testing of that part of the test data, I'm absolutely certain that data itself is "dirty", so it'll need to be cleaned up.

> It might be interesting to add Line/Column info that indicates *where* the error should be thrown (though some past discussions have indicated that Line/Column may not always be an accurate concept).

Actually the line/column info was never guaranteed to be exact; quite the opposite in fact. I'm sure most parser maintainers would provide fixes given good test cases, but it's not been a requirement.

> I know that this can't exactly be legislated (and probably shouldn't be), but having a benchmark/best practice might be useful. I have tested several parsers and the results were pretty astonishing.

Tell me about it ... :)

- Dave

From: Rob L. <ro...@el...> - 2002-04-19 13:44:13
Elliotte Rusty Harold wrote

> Consider the following simple XML document:
>
> <Greeting importance="1">
> Hello!
> </Greeting>
>
> Notice that this document has no DTD. What is the type of the importance attribute?
>
> I've been saying that in cases like this, importance has type CDATA by default. However, Laurent Bihanic pointed out to me that the only place in the XML spec that indicates this really only refers to attribute value normalization, not to anything else. This is in section 3.3.3 which states:
>
> All attributes for which no declaration has been read should be treated by a non-validating processor as if declared CDATA.
>
> This does *not* say that such attributes do have type CDATA, and I can't find anything else in the spec that says they do. (The restriction to non-validating processors here is IMO an erratum in the spec which I just reported to the editors.) I think we should delete the UNDECLARED_ATTRIBUTE field and use CDATA as the default attribute type.
>
> The SAX JavaDoc does say this. See http://www.saxproject.org/apidoc/org/xml/sax/Attributes.html#getType(int) which states:
>
> The attribute type is one of the strings "CDATA", "ID", "IDREF", "IDREFS", "NMTOKEN", "NMTOKENS", "ENTITY", "ENTITIES", or "NOTATION" (always in upper case).
>
> If the parser has not read a declaration for the attribute, or if the parser does not report attribute types, then it must return the value "CDATA" as stated in the XML 1.0 Recommendation (clause 3.3.3, "Attribute-Value Normalization").
>
> However, it's not at all clear that SAX is correct here. I think SAX may need to change in order to distinguish between genuine CDATA attributes and undeclared or untyped attributes. Thoughts?

I can obviously see the distinction between CDATA attributes and undeclared ones, but there are probably few classes of application that care. It's certainly doubtful that it would be worth a disruptive change to the existing implementations. However, it may be an idea to add "isDeclared" or some such method to the new Attributes2 interface before it gets widely deployed.

~Rob

From: David B. <da...@pa...> - 2002-04-19 14:33:17
> > If the parser has not read a declaration for the attribute, or if the parser does not report attribute types, then it must return the value "CDATA" as stated in the XML 1.0 Recommendation (clause 3.3.3, "Attribute-Value Normalization").
> >
> > However, it's not at all clear that SAX is correct here. I think SAX may need to change in order to distinguish between genuine CDATA attributes and undeclared or untyped attributes. Thoughts?
>
> I can obviously see the distinction between CDATA attributes and undeclared ones, but there are probably few classes of application that care. It's certainly doubtful that it would be worth a disruptive change to the existing implementations. However, it may be an idea to add "isDeclared" or some such method to the new Attributes2 interface before it gets widely deployed.

That'd be a possibility ... any strong interest in seeing that get added?

As John Cowan pointed out, validating parsers will see an error() call. Not that you can tell anything from that yet ... like which VC is involved, which clause (anyone care?), which element, attribute, etc.

- Dave

From: Elliotte R. H. <el...@me...> - 2002-04-19 15:10:51
At 7:31 AM -0700 4/19/02, David Brownell wrote:

> > I can obviously see the distinction between CDATA attributes and undeclared ones, but there are probably few classes of application that care. It's certainly doubtful that it would be worth a disruptive change to the existing implementations. However, it may be an idea to add "isDeclared" or some such method to the new Attributes2 interface before it gets widely deployed.
>
> That'd be a possibility ... any strong interest in seeing that get added?

It would be useful for JDOM, which does attempt to distinguish between undeclared and CDATA, but can't really do that when it sits on top of SAX.

> As John Cowan pointed out, validating parsers will see an error() call. Not that you can tell anything from that yet ... like which VC is involved, which clause (anyone care?), which element, attribute, etc.

I most certainly care about that. It would be *hugely* useful in debugging. Of course it would first require a major effort to define standard reporting codes for different validity violations.

--
Elliotte Rusty Harold

From: Rob L. <ro...@el...> - 2002-04-19 16:11:11
Elliotte Rusty Harold wrote:

> > That'd be a possibility ... any strong interest in seeing that get added?
>
> It would be useful for JDOM, which does attempt to distinguish between undeclared and CDATA, but can't really do that when it sits on top of SAX.

It is possible (but a pain) for the JDOM builder to access this information if it uses the DeclHandler interface. Of course not every SAX parser supports this, but then not every SAX parser supports Attributes2 ;-)

I'd say that it wouldn't hurt much to add this to Attributes2 so why not? - but that sentiment probably flies in the face of the founding principles of SAX which seem to be more in line with "keep it out unless it's absolutely necessary".

~Rob

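The DeclHandler route mentioned above can be sketched roughly as follows, assuming the parser recognises the standard SAX declaration-handler property. The collector class and its isDeclared() helper are hypothetical names for illustration; they are not JDOM's actual builder code.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;

// Hypothetical helper: remembers every <!ATTLIST ...> declaration the parser
// reports, keyed by element qName and attribute qName.
class DeclaredAttributeCollector implements DeclHandler {
    private final Set<String> declared = new HashSet<>();

    @Override
    public void attributeDecl(String eName, String aName, String type, String mode, String value) {
        declared.add(eName + "/" + aName);
    }

    // The remaining DeclHandler callbacks are not needed for this sketch.
    @Override public void elementDecl(String name, String model) { }
    @Override public void internalEntityDecl(String name, String value) { }
    @Override public void externalEntityDecl(String name, String publicId, String systemId) { }

    boolean isDeclared(String elementQName, String attributeQName) {
        return declared.contains(elementQName + "/" + attributeQName);
    }

    // Parses the document with the collector installed as the declaration handler.
    static DeclaredAttributeCollector collect(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DeclaredAttributeCollector collector = new DeclaredAttributeCollector();
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", collector);
        reader.parse(new InputSource(new StringReader(xml)));
        return collector;
    }
}
```

The "pain" is visible here: the application has to keep its own table and join it against each startElement() call by qName, which is the bookkeeping an isDeclared() method on Attributes2 would hide.
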
From: David B. <da...@pa...> - 2002-04-21 20:59:07
> > ... validating parsers will see an error() call. Not that you can tell anything from that yet ... like which VC is involved, which clause (anyone care?), which element, attribute, etc.
>
> I most certainly care about that. It would be *hugely* useful in debugging. Of course it would first require a major effort to define standard reporting codes for different validity violations.

The big effort is really in updating parsers to use those codes ... :) Clearly can't happen in the absence of at least a first whack at those codes ... and testing will be interesting too.

Since this is on the list of RFEs, I'll put back an initial version of the API and exception ID definitions, along the lines of what has been discussed here in the past. (IDs are URIs, new SAXParseException methods).

- Dave

From: Jeff R. <jef...@de...> - 2002-04-21 21:12:17
> The big effort is really in updating parsers to use those codes ... :) Clearly can't happen in the absence of at least a first whack at those codes ... and testing will be interesting too.
>
> Since this is on the list of RFEs, I'll put back an initial version of the API and exception ID definitions, along the lines of what has been discussed here in the past. (IDs are URIs, new SAXParseException methods).

This is my common suggestion (maybe I should add an RFE huh?) but I imagine we will either need to come up with a set of tests that indicate which error should be thrown or modify current ones. It might be interesting to add Line/Column info that indicates *where* the error should be thrown (though some past discussions have indicated that Line/Column may not always be an accurate concept). I know that this can't exactly be legislated (and probably shouldn't be), but having a benchmark/best practice might be useful. I have tested several parsers and the results were pretty astonishing.

Cheers,
Jeff Rafter
Defined Systems
http://www.defined.net
XML Development and Developer Web Hosting

From: Rob L. <ro...@el...> - 2002-04-22 10:01:26
> > > Not that you can tell anything from that yet ... like which VC is involved, which clause (anyone care?), which element, attribute, etc.
> >
> > I most certainly care about that. It would be *hugely* useful in debugging.

I'm very interested to know how this would help debugging.

> The big effort is really in updating parsers to use those codes ... :) Clearly can't happen in the absence of at least a first whack at those codes ... and testing will be interesting too.
>
> Since this is on the list of RFEs, I'll put back an initial version of the API and exception ID definitions, along the lines of what has been discussed here in the past.

I fear that this feature may find its way into SAX 2.1 without adequate consideration. So far I can't see a great deal of value in this, yet adding it implies a fair amount of work for all parsers that want to achieve compliance with SAX 2.1 when the time comes.

I can see that these additions to the API would aid automated SAX conformance testing, but is this a good enough reason to justify the cost?

Also, adding error ids will create a whole new class of conformance, taking away some of the (legitimate) choices about which error is the most meaningful for a SAX parser to report.

From the documentation [http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/sax/sax2/src/org/xml/sax/package.html]:

> "For example, [applications] can assemble (and translate) catalogs of messages that make sense to their users."

I don't think this will be a realistic option. Personally I would expect the SAX Parser to provide a translation based on the user's language preferences. If you base message translation on SAX error ids you are going to also have to consider adding replacement parameter information, e.g. "element {1} not expected here, expecting {2}".

> "In some cases they can write code that uses knowledge of the errors to decide how to proceed most effectively. For example, some validity errors might be of no concern, while others might need to be treated as fatal."

I think the most interesting benefit here is that the application may decide how to proceed from a VC. I don't think the application will have many useful choices following a well-formedness error. It may help reduce the scope of this debate if we concentrate on creating IDs for validity constraints only.

Finally, I have a concern about the practice of using the SAX source tree for exploring new API features in this way. My concerns may be unfounded, but don't you think this practice is likely to create a "fait accompli" even though the changes haven't been fully ratified?

Kind regards
~Rob

--
Rob Lugt
ElCel Technology
http://www.elcel.com/

From: David B. <da...@pa...> - 2002-04-22 15:37:28
> I'm very interested to know how this would help debugging.

FWIW I see applications benefitting a lot more from it, though admittedly _any_ relevant information can help someone debugging.

Today, without any such IDs, all you can tell given a specific exception is "something broke" ... and depending on how you get the exception (ErrorHandler vs catch), you might be able to learn the severity (fatal, nonfatal, warning).

Given a usable scheme for such IDs, you can actually know something about what's wrong. That means at least the possibility of doing something more intelligent than applications doing the metaphorical equivalent of throwing up their hands ("complain to some human").

> > Since this is on the list of RFEs, I'll put back an initial version of the API and exception ID definitions, along the lines of what has been discussed here in the past.
>
> I fear that this feature may find its way into SAX 2.1 without adequate consideration.

It got plenty of discussion later last year, and nobody flagged any problems with it back then as I recall. In fact someone even filed an RFE on the topic. I'm sorry, I won't have the time to maintain "constant" consideration ... or even do much to refresh it! :)

> So far I can't see a great deal of value in this, yet adding it implies a fair amount of work for all parsers that want to achieve compliance with SAX 2.1 when the time comes.

Trivial compliance: don't provide the IDs. As it says:

  Not all parsers will choose to provide all these IDs, but those that provide any must only use the exception IDs defined by SAX.

Useful compliance would be more work, true.

> I can see that these additions to the API would aid automated SAX conformance testing, but is this a good enough reason to justify the cost?

Whose costs are you concerned with? I'm not clear why you'd object to applications being able to more accurately identify "WHAT broke". I am clear that parser implementors might not be wholly keen on any more work whatsoever. That's part of why "trivial compliance" is an option.

> Also, adding error ids will create a whole new class of conformance, taking away some of the (legitimate) choices about which error is the most meaningful for a SAX parser to report.

Not as written -- not at all. As for "most meaningful" there's clearly flexibility about how specific to be. (Maybe even too much.) While as for "choices", surely the only legitimate errors to report are as defined in the XML REC.

> From the documentation [http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/sax/sax2/src/org/xml/sax/package.html]:
>
> > "For example, [applications] can assemble (and translate) catalogs of messages that make sense to their users."
>
> I don't think this will be a realistic option. Personally I would expect the SAX Parser to provide a translation based on the user's language preferences.

Not all parsers can/will do that. And in any case, the issue is that applications need to distinguish errors ... there's already a user-targeted mechanism for that (exception messages).

You need to think about the error as distinct from its diagnostic. Presenting a diagnostic is ONE way to handle errors, and not even a very good one: it's a "punt all problems to the user" kind of solution, which users don't want in all cases.

> > "In some cases they can write code that uses knowledge of the errors to decide how to proceed most effectively. For example, some validity errors might be of no concern, while others might need to be treated as fatal."
>
> I think the most interesting benefit here is that the application may decide how to proceed from a VC. I don't think the application will have many useful choices following a well-formedness error. It may help reduce the scope of this debate if we concentrate on creating IDs for validity constraints only.

I don't see any benefit to saying the IDs can _only_ apply to validity errors. Though I suspect you'd be right that providing them in those cases will give the highest initial return (for applications and thus for parser implementors).

> Finally, I have a concern about the practice of using the SAX source tree for exploring new API features in this way. My concerns may be unfounded, but don't you think this practice is likely to create a "fait accompli" even though the changes haven't been fully ratified?

Well, I do think this feature and its need has gotten a fair amount of discussion already. Not in the last month, but later last year there was some significant discussion on the topic. (Check list archives.)

And also RFE [486006] (just closed), filed by someone who wasn't an active participant in that debate, which had been hanging out with this exact resolution (read it!) for about four months ... it's not as if this change hasn't been telegraphed in advance, to this list.

You seem to think that there should be even more "ratification", and I'm at a loss to see why. Could you elaborate? For me, this was just (finally) taking an item off a very publicly created and maintained "TO DO" list, yet it seems you may feel otherwise.

- Dave

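To make the "decide how to proceed" argument concrete, here is a rough sketch of an ErrorHandler that lets some validity errors pass and rethrows the rest. It is built on assumptions: the standard SAXParseException class has no exception-ID accessor, so the ID lookup is passed in as a function, and the class name and any ID strings used with it are hypothetical placeholders rather than IDs defined by SAX.

```java
import java.util.Set;
import java.util.function.Function;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical handler: the application lists the error IDs it does not care
// about; everything else keeps the usual "throw it" behavior.
class IdAwareErrorHandler implements ErrorHandler {
    // Stand-in for the proposed ID accessor; may return null when a parser
    // supplies no ID ("trivial compliance").
    private final Function<SAXParseException, String> idOf;
    private final Set<String> ignorableIds;
    int ignored = 0;

    IdAwareErrorHandler(Function<SAXParseException, String> idOf, Set<String> ignorableIds) {
        this.idOf = idOf;
        this.ignorableIds = ignorableIds;
    }

    @Override
    public void warning(SAXParseException e) {
        // Warnings never stop the parse in this sketch.
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        String id = idOf.apply(e);
        if (id != null && ignorableIds.contains(id)) {
            ignored++;      // of no concern to this application: keep parsing
            return;
        }
        throw e;            // unknown or significant error: treat as fatal
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        throw e;            // well-formedness errors leave few useful choices
    }
}
```

A parser that provides no IDs simply falls through to the rethrow, so the handler degrades to the common "throw on error()" usage.
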
From: Rob L. <ro...@el...> - 2002-04-22 20:56:23
> > I'm very interested to know how this would help debugging.
>
> FWIW I see applications benefiting a lot more from it, though admittedly _any_ relevant information can help someone debugging.
>
> Today, without any such IDs, all you can tell given a specific exception is "something broke" ... and depending on how you get the exception (ErrorHandler vs catch), you might be able to learn the severity (fatal, nonfatal, warning).

The only errors that automatically get translated into thrown exceptions (when there is no error handler) are fatal errors, so that gives the application a big clue. However, for the sake of discussion, let's assume that the application has no control over what the error handler does. In this case, yes, the severity of the error is lost when the application catches the SAXException. If this information is worth preserving (which I think it probably is), then regardless of what we think about the use of error IDs, it may be an idea to make this available by adding getSeverity() to SAXParseException.

> > > Since this is on the list of RFEs, I'll put back an initial version of the API and exception ID definitions, along the lines of what has been discussed here in the past.
> >
> > I fear that this feature may find its way into SAX 2.1 without adequate consideration.
>
> It got plenty of discussion later last year, and nobody flagged any problems with it back then as I recall.

I don't think it really got that much attention. As far as I can see, the last time it was discussed was in November - and your last comment was http://www.geocrawler.com/archives/3/13179/2001/11/50/6984651/. If this doesn't flag problems what does? ;-)

> Whose costs are you concerned with?

- the time involved to analyse the errors and agree on a largely compatible specification for each one
- the development time (if any) to retro-fit the new IDs into existing parsers
- the runtime and memory cost of translating internal error codes to a table of URI strings

Perhaps none of these will be very significant, but I would say that the analysis for this change request is probably greater than any other requests that have been made to date.

> I'm not clear why you'd object to applications being able to more accurately identify "WHAT broke".

I don't object on principle. I just want to be sure there are real-world applications for this information before committing resources to it.

> Not as written -- not at all. As for "most meaningful" there's clearly flexibility about how specific to be. (Maybe even too much.) While as for "choices", surely the only legitimate errors to report are as defined in the XML REC.

You know that a single invalid token could trigger a number of WFCs. Different parsers may give priority to different errors. If SAX conformance is expressed in terms of error URIs, there is an implication that all parsers will have to agree on which error is most significant in a given situation.

> Well, I do think this feature and its need has gotten a fair amount of discussion already. Not in the last month, but later last year there was some significant discussion on the topic. (Check list archives.)

Yes, I've checked them. I can see that the issue was discussed in November - but I got the impression that more questions were raised than answers given. I also got the impression that you considered there to be a number of difficulties to be overcome before the change could be implemented. Perhaps I read it wrongly?

> You seem to think that there should be even more "ratification", and I'm at a loss to see why. Could you elaborate? For me, this was just (finally) taking an item off a very publicly created and maintained "TO DO" list, yet it seems you may feel otherwise.

Sure, I don't think this item should have gone on the "to do" list. You obviously think otherwise. I don't think a good case has been made for the proposal - you do. Neither of us is right or wrong - we just have different perspectives. One big difference in our positions is that you are able to make changes to SAX just because you think it's right, whereas I (and everyone else) have to argue our point - sometimes in vain ;-(

That said, I think you're doing a good job.

All the best,
~Rob

From: David B. <da...@pa...> - 2002-04-27 16:51:39
> > > I'm very interested to know how this would help debugging.
> >
> > FWIW I see applications benefiting a lot more from it, though admittedly _any_ relevant information can help someone debugging.
> >
> > Today, without any such IDs, all you can tell given a specific exception is "something broke" ... and depending on how you get the exception (ErrorHandler vs catch), you might be able to learn the severity (fatal, nonfatal, warning).
>
> The only errors that automatically get translated into thrown exceptions (when there is no error handler) are fatal errors, so that gives the application a big clue.

Only in some cases -- like when they happen to _know_ that no other component installed an ErrorHandler. Which is NOT the type of global constraint I like to require in modular applications. Also, a typical usage is to just provide an ErrorHandler.error() that throws its argument ... at which point said "clue" has been discarded.

Applications also can't treat severity as exposing "what broke". I mentioned severity as an example of information that's now partially exposed.

> ... If this information is worth preserving (which I think it probably is), then regardless of what we think about the use of error IDs, it may be an idea to make this available by adding getSeverity() to SAXParseException.

Might be reasonable. It could return constants (interned strings) like "error", "fatalError", and "warning". Or null, for parsers that don't fill in such info.

I'd expect that most non-XML faults would map to "error" ... the RDF or XLink usage of SAXParseException that David Megginson mentioned, for example. Is that a fair expectation? Should a spec for such a method stipulate that other "severity" values are illegal?

And I'm assuming a write-once setSeverity(), too; though that could be inferred when setting at least some error IDs.

> > It got plenty of discussion later last year, and nobody flagged any problems with it back then as I recall.
>
> I don't think it really got that much attention.

How much is enough for you, though? :)

I thought the thread showed a fair level of interest. Not comparable to flamewars on namespaces and similarly contentious issues, which is a decided feature IMO. I think you were the only person who had any notable skepticism about having the functionality.

> As far as I can see, the last time it was discussed was in November - and your last comment was http://www.geocrawler.com/archives/3/13179/2001/11/50/6984651/. If this doesn't flag problems what does? ;-)

The only issue it identifies is the "how to report this error" policy. Though I did criticize a non-ID based proposal (adding a few hundred exception subclasses :). That'll be an issue for some kinds of syntax violations (WF-ness that violates some grammar rule) but not for _any_ other cases.

> > Whose costs are you concerned with?
>
> - the time involved to analyse the errors and agree on a largely compatible specification for each one

All (!!) of those costs have been paid already as part of the XML REC. Or if not, the problem is a W3C problem, in terms of ambiguity in that REC.

Though if you want to cross the line from "what is the error" to giving specifics like which attribute/element/entity/... names were involved, then I'd agree. That could be added later, if some way could be found to get useful consensus.

> - the development time (if any) to retro-fit the new IDs into existing parsers
> - the runtime and memory cost of translating internal error codes to a table of URI strings

As I pointed out, the cost here can be zero. Although I think that it's clear that, as you said, there's a clear application benefit to providing these IDs for validation errors, so I'd expect at least those costs to be incurred by some parsers.

> I don't object on principle. I just want to be sure there are real-world applications for this information before committing resources to it.

Understood. There's also the "how much to throw at it" ... and I was pleased by your observation about VC violations being particularly valuable.

> > Not as written -- not at all. As for "most meaningful" there's clearly flexibility about how specific to be. (Maybe even too much.) While as for "choices", surely the only legitimate errors to report are as defined in the XML REC.
>
> You know that a single invalid token could trigger a number of WFCs.

Well, no ... "invalid" tokens would trigger VCs, and if some token triggers a WFC it's clearly only that one WFC. Both of those are easily testable; no parser has any choices to make there, only that VC or WFC could have been violated.

> Different parsers may give priority to different errors. If SAX conformance is expressed in terms of error URIs, there is an implication that all parsers will have to agree on which error is most significant in a given situation.

The only ambiguity of which I'm aware relates to the policy of which grammar production to report as violated. One doesn't need to test SAX conformance for those cases ... though it'd be good to have a way to evaluate how widely parse policies there diverge. (If it's too much, apps won't be able to make effective use of "rule-*" IDs.)

> I also got the impression that you considered there to be a number of difficulties to be overcome before the change could be implemented. Perhaps I read it wrongly?

Read wrongly. See above ... though I admit that I only recently formulated the policy issue as specific to the rule-* codes; the wfc-* and vc-* codes have no such issues. (And neither would the cases specified by must/should/may prose in the body of the REC, which don't currently have IDs.)

> Neither of us is right or wrong - we just have different perspectives. One big difference in our positions is that you are able to make changes to SAX just because you think it's right, whereas I (and everyone else) have to argue our point - sometimes in vain ;-(
>
> That said, I think you're doing a good job.

Thanks ... one of the roles of a maintainer is sometimes to make a decision even knowing that there will be a few voices that disagree. (As I expect you know from personal experience!)

Pushback (such as yours) is a healthy thing, and I'm glad to get it. When well done, it helps focus and clarify ... to everyone's benefit!

- Dave

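The getSeverity() idea can be approximated today without changing SAX, which may help in judging whether it is worth adding. The sketch below is an assumption-laden illustration: SeverityParseException and SeverityTaggingHandler are hypothetical names, and the string constants simply mirror the values floated above ("warning", "error", "fatalError").

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical subclass: the standard SAXParseException has no getSeverity().
class SeverityParseException extends SAXParseException {
    static final String WARNING = "warning";
    static final String ERROR = "error";
    static final String FATAL_ERROR = "fatalError";

    private final String severity;

    // Severity is fixed at construction, which gives the "write-once" behavior
    // without needing a separate setSeverity() call.
    SeverityParseException(String severity, SAXParseException cause) {
        super(cause.getMessage(), cause.getPublicId(), cause.getSystemId(),
              cause.getLineNumber(), cause.getColumnNumber(), cause);
        this.severity = severity;
    }

    String getSeverity() {
        return severity;
    }
}

// Records which ErrorHandler callback was used before throwing, so the
// severity survives until the application's catch block.
class SeverityTaggingHandler implements ErrorHandler {
    @Override
    public void warning(SAXParseException e) {
        // Warnings are not thrown in this sketch.
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        throw new SeverityParseException(SeverityParseException.ERROR, e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        throw new SeverityParseException(SeverityParseException.FATAL_ERROR, e);
    }
}
```

This only works when the application controls the ErrorHandler, which is exactly the modularity limitation raised above; a getSeverity() on SAXParseException itself would not have that restriction.
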
From: Rob L. <ro...@el...> - 2002-04-28 20:16:03
> > ... If this information is worth preserving (which I think it probably is), then regardless of what we think about the use of error IDs, it may be an idea to make this available by adding getSeverity() to SAXParseException.
>
> Might be reasonable. It could return constants (interned strings) like "error", "fatalError", and "warning". Or null, for parsers that don't fill in such info.
>
> I'd expect that most non-XML faults would map to "error" ... the RDF or XLink usage of SAXParseException that David Megginson mentioned, for example. Is that a fair expectation? Should a spec for such a method stipulate that other "severity" values are illegal?

I'm not sure. I think you're right that most application errors would map to "error". The obvious problem with allowing the permitted values to be extended is that the poor application that catches the exception may not know what to make of it. If it is decided that the permitted values cannot be extended, this would imply that C++ implementations could make this an enum.

> And I'm assuming a write-once setSeverity(), too; though that could be inferred when setting at least some error IDs.

Would you allow setSeverity() and setExceptionId() to be called on the same instance? If so, how would you ensure consistency?

> I thought the thread showed a fair level of interest. Not comparable to flamewars on namespaces and similarly contentious issues, which is a decided feature IMO. I think you were the only person who had any notable skepticism about having the functionality.

Perhaps that's why I thought it wasn't widely enough discussed. But as I still seem to be the lone dissenter, I'll waste no more of your time with my arguments.

~Rob

From: David B. <da...@pa...> - 2002-05-01 18:47:08
|
> > > getSeverity() to SAXParseException.
> >
> > Might be reasonable. It could return constants (interned strings)
> > like "error", "fatalError", and "warning". Or null, for parsers that
> > don't fill in such info.

[ ... ]

>
> If it is decided that the permitted values cannot
> be extended, this would imply that C++ implementations could make this an
> enum.

That'd be my tendency, though it bothers me slightly ... though I've never seen additional classifications be more than a distraction, there are some folk who like to spend time classifying things.

I take it you're seriously suggesting that such a method be added. If so, would you file an RFE so the list discussion isn't the only record?

Do we have other opinions on this topic, pro or con?

> > And I'm assuming a write-once setSeverity(), too; though that
> > could be inferred when setting at least some error IDs.
>
> Would you allow setSeverity() and setExceptionId() to be called on the same
> instance? If so, how would you ensure consistency?

Maybe it'd be better to have them be set by the same call. I think ensuring consistency must fundamentally be the responsibility of whoever provides that information ... only the folk defining the IDs can be authoritative about what they mean, and SAX should not embed such a database.

- Dave |
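A minimal sketch of the "set by the same call" idea, for illustration only. It assumes a hypothetical subclass, ClassifiedParseException, with a hypothetical setClassification() method; neither is part of SAX, and the constant values are simply the strings suggested earlier in the thread. The call is write-once, and it stores whatever pair the caller supplies, since only the party defining the IDs knows what they mean.

```java
import org.xml.sax.Locator;
import org.xml.sax.SAXParseException;

/** Hypothetical subclass for illustration; not part of SAX. */
class ClassifiedParseException extends SAXParseException {
    // Severity strings as suggested in the thread (assumed values).
    public static final String WARNING = "warning";
    public static final String ERROR = "error";
    public static final String FATAL_ERROR = "fatalError";

    private String exceptionId;   // null until classified
    private String severity;      // null until classified
    private boolean classified;

    ClassifiedParseException(String message, Locator locator) {
        super(message, locator);
    }

    /**
     * Sets the ID and severity together, once. The pair is stored as
     * given; no lookup table of IDs is consulted.
     */
    void setClassification(String exceptionId, String severity) {
        if (classified) {
            throw new IllegalStateException("classification already set");
        }
        this.exceptionId = exceptionId;
        // Intern so that callers may compare against the constants with ==.
        this.severity = (severity == null) ? null : severity.intern();
        this.classified = true;
    }

    String getExceptionId() { return exceptionId; }

    /** Returns null when nothing filled in the severity. */
    String getSeverity() { return severity; }
}
```

Because both values arrive in one call, there is no window in which an instance carries an ID without its severity, which is one way to read the consistency concern raised above.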
From: Rob L. <ro...@el...> - 2002-05-08 10:51:17
|
>
> I take it you're seriously suggesting that such a method be added. If so,
> would you file an RFE so the list discussion isn't the only record?
>

I would like this suggestion to have wider (vocal) support before being added to SAX. However, I have added an RFE as a place-holder. RFE #553670

~Rob |
From: Yuval O. <yu...@bl...> - 2002-05-08 18:56:31
|
Thanks for posting the RFE; it gave me a nice summary.

I like the severity idea but think it would be better to use integer constants instead of strings: WARNING, ERROR, and FATAL_ERROR. That would be more consistent with the JDK (e.g. java.util.logging package) and would also prevent spelling mistakes.

Yuval |
From: Rob L. <ro...@el...> - 2002-05-08 20:17:30
|
I tend to agree, as I come to this from a C++ perspective and would choose an enum value for this purpose. However, I can think of two reasons why string values may be preferable:-

1) If we decide that the allowed values are extensible, then string values can convey more meaning than integers
2) Attributes.getType() sets a precedent as it also returns a string where an integer could have been used

As an aside, if the value is extensible then how is this extensibility managed? How would we avoid naming conflicts? This would tend to indicate that a URI should be used instead of plain strings.

~Rob |
From: Elliotte R. H. <el...@me...> - 2002-05-09 15:58:18
|
At 11:56 AM -0700 5/8/02, Yuval Oren wrote:
>I like the severity idea but think it would be better to use integer
>constants instead of strings: WARNING, ERROR, and FATAL_ERROR. That would be
>more consistent with the JDK (e.g. java.util.logging package) and would also
>prevent spelling mistakes.
>

How about a type-safe enum instead of either one; e.g.

public class Error {

    private Error() {};

    public static final Error FATAL_ERROR = new Error();
etc...
|
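For reference, the type-safe enum pattern suggested above might be filled out roughly as follows. This is a sketch only: the class is named Severity here (a hypothetical name, chosen so it does not shadow java.lang.Error), and the name strings are the ones floated earlier in the thread rather than anything SAX defines.

```java
/** Hypothetical type-safe enum for illustration; not part of SAX. */
final class Severity {
    public static final Severity WARNING = new Severity("warning");
    public static final Severity ERROR = new Severity("error");
    public static final Severity FATAL_ERROR = new Severity("fatalError");

    private final String name;

    // Private constructor: the three constants above are the only instances.
    private Severity(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return name;
    }
}
```

Since no other instances can exist, callers can compare with == and the compiler rejects a misspelled constant. The cost is one more class in the API and a closed set of values, which bears on the extensibility question raised earlier.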
From: Mikael S. <mik...@ho...> - 2002-05-09 21:29:41
|
At 08:55 2002-05-09 -0400, Elliotte Rusty Harold wrote:
>How about a type-safe enum instead of either one; e.g.
>
>public class Error {
>
>    private Error() {};
>
>    public static final Error FATAL_ERROR = new Error();
>etc...

Please don't make things more complicated than necessary. SAX is *Simple* API for XML. Please keep it that way. |
From: David B. <da...@pa...> - 2002-05-09 18:16:04
|
Given the choice between unique integers or objects, and the strings that are already in use, I prefer not to add a new model (stick with strings).

It might make sense to use string constants though, so folk can do

    if (e.getSeverity () == e.warning) ...

rather than

    if (e.getSeverity () == "warning") ...

if they're concerned that they'll have typos, and don't want to actually test things.

- Dave |
From: Yuval O. <yu...@bl...> - 2002-05-09 21:47:04
|
I had forgotten about the prevalence of strings in the current API. Considering that, I agree that strings would be best. It would be nice to include string constants, though, along with the requirement that parsers either use the constants or internalize the strings they create. Yuval |
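A small sketch of why that requirement matters, assuming a hypothetical SeverityStrings holder class (not part of SAX). In Java, == on strings compares references, so a comparison against a constant only works if the parser hands back the constant itself or an interned copy.

```java
/** Hypothetical holder for severity constants; not part of SAX. */
final class SeverityStrings {
    static final String WARNING = "warning";
    static final String ERROR = "error";
    static final String FATAL_ERROR = "fatalError";

    private SeverityStrings() {}

    /** What a parser would do with a severity string it built at run time. */
    static String canonical(String built) {
        return built.intern();
    }

    public static void main(String[] args) {
        // Simulates a string assembled by the parser rather than taken from a constant.
        String built = new String("warning");

        System.out.println(built == WARNING);            // false: different object
        System.out.println(canonical(built) == WARNING); // true: interned copy
        System.out.println(built.equals(WARNING));       // true: content comparison
    }
}
```

An application that uses equals() is unaffected either way; the interning rule only matters for code that relies on ==.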
From: John C. <jc...@re...> - 2002-04-19 12:19:28
|
Elliotte Rusty Harold scripsit:

> This does *not* say that [undeclared] attributes do have type CDATA, and I
> can't find anything else in the spec that says they do.

No, but CDATA means in effect "No restrictions, only the simplest kind of normalization" and that is what is done with undeclared attributes as well.

> (The restriction to non-validating processors here is IMO an erratum in
> the spec which I just reported to the editors.)

Validating processors treat undeclared attributes as validity errors.

--
John Cowan <jc...@re...>     http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_ |
From: Rob L. <ro...@el...> - 2002-04-19 13:27:33
|
John Cowan wrote:
> > (The restriction to non-validating processors here is IMO an erratum in
> > the spec which I just reported to the editors.)
>
> Validating processors treat undeclared attributes as validity errors.

True, but that doesn't stop the attributes from still requiring normalization and being reported to the application. I agree with Elliotte, this seems to be an oversight in the recommendation.

~Rob |
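To see what a given parser actually reports for the DTD-less document at the top of this thread, something like the following can be used. It is a sketch using the standard JAXP and SAX2 classes; the class and method names (UndeclaredAttributeType, typeOfFirstAttribute) are made up for the example. Going by the getType() javadoc quoted earlier, a parser that has read no declaration should report "CDATA", but the result depends on the parser in use.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative helper; the names here are made up for this example. */
class UndeclaredAttributeType {

    /** Returns the type reported for the first attribute seen, or null if none. */
    static String typeOfFirstAttribute(String xml) throws Exception {
        final String[] type = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Record only the first attribute of the first element that has one.
                if (type[0] == null && atts.getLength() > 0) {
                    type[0] = atts.getType(0);
                }
            }
        };
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return type[0];
    }

    public static void main(String[] args) throws Exception {
        String doc = "<Greeting importance=\"1\"> Hello! </Greeting>";
        System.out.println(typeOfFirstAttribute(doc));
    }
}
```

Note that the result cannot distinguish an attribute declared as CDATA from one with no declaration at all, which is the ambiguity the original post asks about.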