From: Michael B. <mbn...@mb...> - 2003-07-09 21:57:14
> Thanks for clarifying - I wasn't aware of NaN. So, according to
> what you say,
>
> //div[number(nr)=number('')]
>
> should test for NaN in "nr" elements, but it returns the empty
> nodeset.

Again that's indeed the wrong result, given your data, and it's
probably another sign that the notation for the empty string isn't
being handled properly in eXist's XPath implementation. But your
paraphrase of what's being tested isn't quite accurate. You aren't
"testing for NaN in 'nr' elements", because NaN is a result of
evaluating an expression, not a value that an element could contain.
Here, the right-hand side will (should) always evaluate to NaN, as
will the left-hand side if the nodeset is empty or if the string value
of its first node (in document order) isn't a number.

> //div[nr=NaN]
>
> does the same

Not necessarily. The implicit application of number() to a nodeset in
an equality expression is only invoked when the other operand is a
number. NaN is one of a kind. Its internal representation is a bit
pattern specified in IEEE 754 to stand for the result of calculations
that can't be expressed as a number: it isn't itself a number (or a
string), so that predicate is most likely always false.

[The XPath standard prescribes that all numbers (including integers)
shall be represented according to IEEE 754, which is how NaN gets into
XPath. The fact that many XPath implementations are written in Java,
which also uses IEEE 754, is just coincidence: XPath implementations
written in languages that don't use IEEE 754 natively must still
(appear to) do so when handling numbers in XPath.]

> By the
> way - does XPath/eXist provide a way to distinguish between
> <nr></nr> and <nr />?

No, and nor does any XML-conformant application. It's a purely lexical
difference which disappears during the parse.
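
To see the last point in practice, here is a minimal sketch using
Python's standard xml.etree.ElementTree parser (my choice purely for
illustration; it is not the parser eXist uses). The helper name
describe() and the two sample documents are made up for this example.
Both spellings of the empty element should come out of the parse
looking the same.

```python
import xml.etree.ElementTree as ET

# Two spellings of an empty <nr> element; they differ only lexically.
long_form = ET.fromstring("<div><nr></nr></div>")
short_form = ET.fromstring("<div><nr /></div>")


def describe(root):
    """Return what the parsed tree exposes for the <nr> child (hypothetical helper)."""
    nr = root.find("nr")
    return (nr.tag, nr.text, len(nr), dict(nr.attrib))


# Compare the in-memory trees and their re-serialised forms.
same_tree = describe(long_form) == describe(short_form)
same_output = ET.tostring(long_form) == ET.tostring(short_form)

print(same_tree, same_output)
```

Once the two documents have been parsed, nothing in the tree records
which spelling was used, so an XPath expression evaluated over that
tree has nothing to go on.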
> //div[nr and string(nr)='']
>
> does return what I wanted

Now that's interesting from the point of view of the eXist engine's
treatment of the empty string, and that observation should help
Wolfgang work out what's going wrong. In fact, both nr and string(nr)
evaluate to the same thing when equated to a string (including the
empty string), so that expression ought to behave identically to
[nr=''] but from what you say it seems it doesn't do so in eXist.

> while
> //div[string(nr)='']
>
> includes div 7 in the resultset. Apparently the result of converting a
> non-existing element into a string is not "NaS" (not a string) or
> "undefined" or something, but the empty string. So what does
> "number(nr)" return for a non-existing <nr> element? NaN?

There's no conversion of a non-existent element to a string here.
Remember, we're talking nodesets. string(nr) operates on the nodeset
which the nr portion of the expression returns. So if nr evaluates to
the empty nodeset, then the rules say that the result is the empty
string. So for div 7, the predicate compares the empty string to
itself, with result true. But again it's interesting that here eXist
seems to be interpreting the expression '' correctly.

All this insistence on nodesets isn't mere pedantry. It's an issue
that leads to a lot of grief and misunderstanding in the XPath world.
Just one final example of what I mean. Let's truncate your example
file to:

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<test>
<div id="7"></div>
</test>

Now consider the XPath

//div[nr = nr]

applied to this file. Not very useful but perfectly legal. But what's
the value of the expression in the predicate? Deep breath: the value
is false. How come? Because where the operands on both sides of an
XPath equality expression are node sets, the expression is true if,
and only if, there is at least one node in each set that has the same
string value as at least one node in the other set.
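
The rules just described can be modelled in a few lines of Python. This
is only a sketch of the semantics, not an XPath engine: a node-set is
represented as a plain list of the string values of its nodes, and the
function names xpath_string, nodeset_eq and nodeset_ne are invented for
this illustration. The != variant is included because it is the mirror
image of the = rule.

```python
def xpath_string(nodes):
    """Model of string(node-set): string value of the first node, or '' if empty."""
    return nodes[0] if nodes else ""


def nodeset_eq(left, right):
    """Model of node-set = node-set: true if some pair of nodes has equal string values."""
    return any(a == b for a in left for b in right)


def nodeset_ne(left, right):
    """Model of node-set != node-set: true if some pair of nodes has differing string values."""
    return any(a != b for a in left for b in right)


# What `nr` selects inside <div id="7"></div>: no nodes at all.
empty = []

string_is_empty = xpath_string(empty) == ""   # string(nr) = ''
eq_empty = nodeset_eq(empty, empty)           # nr = nr
ne_empty = nodeset_ne(empty, empty)           # nr != nr

# A hypothetical div with two nr children whose values are "1" and "2".
mixed = ["1", "2"]
eq_mixed = nodeset_eq(mixed, mixed)
ne_mixed = nodeset_ne(mixed, mixed)

print(string_is_empty, eq_empty, ne_empty, eq_mixed, ne_mixed)
# prints: True False False True True
```

With no nodes on either side there are no pairs to compare, so any()
has nothing to find; with two differing nodes, both an equal pair and
an unequal pair exist, so both comparisons succeed.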
So, if either or both of the node-sets are empty, the result is false.

And while people are spluttering about that, how about, applied to the
same data,

//div[nr != nr]

? That one is false as well: an XPath expression with operator != and
node sets as both operands evaluates to true if and only if there is
at least one node in one set that has a different string value from at
least one node in the other set. So again, if either set is empty, the
expression evaluates to false (and there are a lot of cases in which
!= and = between two non-empty nodesets both return the same Boolean
result, since the conditions for the truth of both = and != can hold
good simultaneously).

There are more somewhat counter-intuitive things to be said about
operations on XPath nodesets, but I've probably taken up too much time
and bandwidth already. If it's any consolation, readers of this list
can now safely join me at any conference bar without running the risk
of me winning a beer off them with this little conundrum. Though I do
have one or two others in reserve...

Michael Beddow