From: Clark C . E. <cc...@cl...> - 2002-03-08 14:42:08
On Fri, Mar 08, 2002 at 03:27:30AM -0500, Oren Ben-Kiki wrote:
| Let's consider each YPATH to be a regexp pattern. The '/' would be an
| operator which says "look for a child node matching the rest of the regexp".
| This makes it easy to write things like
|
|   /pat1(/pat2|/pat3/pat4)/pat5

Hmm. I think we could use [] as the existence test (like XPATH).

    /pat1[/pat2|/pat3/pat4]/pat5

Navigate to a root key "pat1" such that there exists either another
root key "pat2" or a subtree "pat3" with child "pat4". Then navigate
to child "pat5".

|   /value
| matches
|   --- value

I was thinking that /xxx would only match on keys and not on values.
If you want to match on values, I think using the exists operator
would be ideal: /[.='value'], perhaps abbreviated to ['value'].

    # /two matches "world"
    ---
    one: hello
    two: world
    ...

|   /12 matches
|   --- 12

Once again, I would assume that /12 matches the 13th item in a
top-level sequence. Since the item isn't a sequence, this pattern
would not match --- 12.

| However, we need to allow regular expressions. So in YPATH there are more
| characters which need to be quoted: /, *, etc. So:
|
|   /val.*e
| matches --- value: matches ...
|
|   /"/value"
| matches --- value: matches ...

|   /transfer(int)

Close. This would be /[transfer("int")], since transfers belong in
the predicate.

| instead? This has the advantage it makes quoting transfers easier, as in
|
|   /transfer("http://...")

Sure. We could even allow both forms, since both forms are allowed
after the !, no?

| Question#2: Does
|
|   /transfer(ip)
| match
|   --- 1.2.3.4

I think so. If the engine/parser doesn't have the "ip" method
registered, then transfer("ip") will be an error.

| Question#3: Does
|
|   /transfer(http:)
| match
|   --- !int 12

No wildcard transfer methods.

| Question #4: Given we have series/sequences, should we add regexp operators
| >int, >=int, <int, <=int, !=int?

Yes, but in the predicate [].

| Question #5: Is this set ordered? Think of it... it is trickier
| than it seems at a first glance.

I was thinking the result of a YPath expression is an unordered stack
of nodes, that I'll call a context. As for ordering... I think we
can add some sort of "ORDER BY". Having the nodes ordered implicitly
causes a lot of work for the XPath processor that is often unnecessary
in XML (and probably impossible in our case).

| Question #6: How about prefixes? Do we allow this:
|
|   /transfer("http://company.tld/whatever^type")/transfer("^another-type")

This could work.

| So, sticking with absolute paths, it seems as though parent() is well
| defined. ancestor(), descendant() are also easy; "child()" is merely '*' in
| a different guise:
|
|   /*/b/parent()

XPATH has the notion of "axis", thus each path segment is

    (axis '::')? node-expr

-- I'm not sure if this fits us. Is it possible to use ".." for
parent, and "." for the current node? I know both of these are
regex... however.

| In a sequence, we need to specify next()/prev() entry in a series:
|
|   /*/b/parent()/prev()
| matches
|   ---
|   - this
|   - b
|
| Likewise we can define 'before()' and 'after()'.

Hmm. Unlike XML, our tree is not ordered (although it can be ordered
by the keys). This requires some thought.

| Question #7: Using any of these function() operators (except for transfer())
| means that the path is no longer "simple". This has implications in terms of
| how easily it can be implemented, whether it can be used in a streaming
| application, etc. I think we should formally define random-access vs.
| streaming paths.

Right.

|
| Question #8: There are many functions which can be cast in the form
| <direction>(<distance>), such as:
|
|   up(1)   - parent()
|   up(>0)  - ancestor()
|   up(>=0) - ancestor-or-self
|
| Etc. We should probably offer the general form... does using '>' '>=' '<'
| '<=' make sense to you in this context? Does this mean that: next(>3&<10)
| should be allowed? next(>3)&next(<10) would be allowed anyway, I guess, but
| is less clear I think.

Hmm.
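
To make the ".." and "." idea a bit more concrete, here is a rough
Python sketch, not a proposal for settled syntax. It assumes a context
is a tuple of nodes from the root down to the current node, that a
plain segment matches mapping keys (or zero-based sequence indexes)
but never values, and that the YAML is already loaded into dicts and
lists. The names step() and evaluate() are made up for illustration.

```python
# Hypothetical sketch: a context is a tuple of nodes (root, ..., current),
# so ".." can be answered by dropping the last node; no parent pointers needed.

def step(contexts, segment):
    """Apply one path segment to every context and collect the results."""
    results = []
    for ctx in contexts:
        node = ctx[-1]
        if segment == ".":
            results.append(ctx)                      # current node
        elif segment == "..":
            if len(ctx) > 1:                         # the root has no parent
                results.append(ctx[:-1])
        elif isinstance(node, dict):
            if segment in node:                      # match on keys only
                results.append(ctx + (node[segment],))
        elif isinstance(node, list):
            if segment.isdigit() and int(segment) < len(node):
                results.append(ctx + (node[int(segment)],))
        # scalars have no children, so nothing matches below them
    return results


def evaluate(root, path):
    """Evaluate an absolute path such as "/two/.." against a loaded document."""
    contexts = [(root,)]
    for segment in (s for s in path.split("/") if s):
        contexts = step(contexts, segment)
    # Treated as an unordered collection; a list is used only because
    # tuples containing dicts are not hashable.
    return contexts


doc = {"one": "hello", "two": "world"}
tops = [ctx[-1] for ctx in evaluate(doc, "/two")]    # top node of each context
```

With this reading, /two lands on "world", /world matches nothing
because "world" is a value rather than a key, and /12 against the
scalar document --- 12 matches nothing either.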

| Question #9: Do we do it the Perl way ("there's more than one way") or the
| Python way ("there's the right way"). Example: do we allow /../ as a
| shorthand for /up(1)/? How about /.../ as a shorthand for /down(>0)/, or
| /..../ as a shorthand for /up(>0)/? Remember * is already a shorthand for
| down(1)...

I think ".." is fine; "..." and "...." tip the scale.

| Question #10: obviously the above provides a natural syntax for them.
| However, in a relative path, going up above the starting point isn't
| well-defined in the graph model. So... how do we handle this? Define two
| classes of relative paths, one which are safe in a graph model and ones that
| aren't?

The result of a YPath evaluation is an unordered set of contexts
(a context is a stack of nodes, aka path). So, to evaluate a
"relative" path, one must pass in a context, not just a node.
Note that any node is a context with itself as the "top" node.

| Question #11: How do we handle !include?

Good question. !include is non-trivial (see XML Include).

| Hmmm. YPATH isn't that simple after all, it seems. Neither is !include :-)

Nope. YPATH is a 6 month project. And it's best to define it
as we implement it so that we can always play with it...

The big thing which seems to be missing in your exposition above
is the distinction between "navigation" and "predicate". Navigation
functions return contexts, whereas predicate functions return boolean
values. Predicates [] are filters.

Best,

Clark

--
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software
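
P.S. The navigation/predicate split can be sketched in a few lines of
Python. This is only an illustration under assumptions: a context is
a tuple of nodes ending in the current node, and child(), children(),
value_is(), has() and apply_step() are hypothetical names, not part of
any YPATH proposal.

```python
# Navigation functions map a context to a list of contexts.
# Predicate functions map a context to True/False and are used as filters.

def child(key):
    """Navigation: step to the child stored under a mapping key."""
    def navigate(ctx):
        node = ctx[-1]
        if isinstance(node, dict) and key in node:
            return [ctx + (node[key],)]
        return []
    return navigate


def children():
    """Navigation: step to every child of a mapping or sequence."""
    def navigate(ctx):
        node = ctx[-1]
        if isinstance(node, dict):
            return [ctx + (value,) for value in node.values()]
        if isinstance(node, list):
            return [ctx + (item,) for item in node]
        return []
    return navigate


def value_is(expected):
    """Predicate: true when the top node of the context equals `expected`."""
    return lambda ctx: ctx[-1] == expected


def has(navigation):
    """Predicate: existence test, true when `navigation` finds anything."""
    return lambda ctx: bool(navigation(ctx))


def apply_step(contexts, navigation, predicate=None):
    """Navigate from each context, then keep results the predicate accepts."""
    kept = []
    for ctx in contexts:
        for new_ctx in navigation(ctx):
            if predicate is None or predicate(new_ctx):
                kept.append(new_ctx)
    return kept


doc = {"one": "hello", "two": "world"}
start = [(doc,)]   # any node is a context with itself as the top node
by_value = apply_step(start, children(), value_is("world"))
```

Because apply_step() takes a list of contexts rather than a bare
node, the same call works for a relative path: pass in whatever
context you are standing on instead of the root.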