Thread: Re: Re[5]: [cedet-semantic] Using Semantic to parse XPATH
From: David Ponce<dp...@vo...> - 2001-12-11 10:23:12
Hi Alex,

I managed to get your small example to work. The problem was that DOUBLESLASH was defined as a start nonterminal because it was the first one defined in your grammar. Thus the grammar had two start nonterminals defined: DOUBLESLASH and LocationPath (because of the %start statement). In fact, the %start statement defines extra alternate entry points in the grammar, and the first nonterminal defined in the grammar is always the main entry point. I think I must make the doc clearer about this point ;)

Attached you will find the fixed grammar and the very small example I used to parse "child::chapter".

Also notice that I removed all "( $1 )" productions. $1 is the default production (like in Bison) when nothing is specified.

Finally, I think it is useful to know that productions in the LALR grammar do not (yet) obey the same writing rules that are used for Semantic LL parser grammars. In an LALR grammar, productions do not use the OLE (Optional Lambda Expression) mechanism. For example, in an LL grammar you can write ( $1 ) as a valid production, but in an LALR grammar you must write (identity $1).

Hope this helps.

Sincerely,
David

____________________________________________________________
Make a wish and then Voila! www.voila.fr
With Voila Mail, check your e-mails on your WAP mobile phone.
From: Alex S. <al...@gn...> - 2001-12-13 01:31:50
"David Ponce"<dp...@vo...> writes:

> Attached you will find a BNF grammar I quickly hacked from the XPath
> specification I found at <http://www.w3.org/TR/xpath>. It still needs
> more work to be usable by Wisent but maybe it could be a starting point
> to improve your grammar

Thanks a lot! I used your grammar and got everything to work again. The following expressions are no longer a problem.

(xpath-lex-string "child::para")
(wisent-parse xpath-tables #'xpath-pop-input #'error)

(xpath-lex-string "child::para/parent::*")
(wisent-parse xpath-tables #'xpath-pop-input #'error)

(xpath-lex-string "child::para/parent::text()")
(wisent-parse xpath-tables #'xpath-pop-input #'error)

While compiling the BNF file, I get one shift/reduce conflict. Do you think I should invest some energy in eliminating them? The bison manual says this is ok, but I'm not sure. I see at least two such potential conflicts in the BNF file I have right now:

# [2]
AbsoluteLocationPath
  : SLASH
  | SLASH RelativeLocationPath
  | AbbreviatedAbsoluteLocationPath
  ;

and

# [19]
PathExpr
  : LocationPath
  | FilterExpr
  | FilterExpr SLASH RelativeLocationPath
  | FilterExpr SLASH SLASH RelativeLocationPath
  ;

Is this correct? If so -- is wisent reporting a wrong number of conflicts? If not, what am I missing?

Alex.
-- 
http://www.emacswiki.org/
From: David Ponce<dp...@vo...> - 2001-12-13 07:30:33
Hi Alex,

> Thanks a lot! I used your grammar and got everything to work again.

Great! Maybe some part of your work could be used to provide Semantic facilities for an 'xpath-mode'? What do you think? Eric?

> While compiling the BNF file, I get one shift/reduce conflict. Do you
> think I should invest some energy in eliminating them? The bison
> manual says this is ok, but I'm not sure. I see at least two such
> potential conflicts in the BNF file I have right now:

In most cases such conflicts can be safely ignored. But, please, could you send me a tarball of your BNF and EL files, so I could have a look at the conflict in your grammar?

Thanks!
David
From: Eric M. L. <er...@si...> - 2001-12-13 12:35:57
>>> "David Ponce"<dp...@vo...> seems to think that:
>Hi Alex,
>
>> Thanks a lot! I used your grammar and got everything to work again.
>
>Great! Maybe some part of your work could be used to provide Semantic
>facilities for an 'xpath-mode'? What do you think? Eric?
  [ ... ]

That certainly makes sense. An xpath mode to bind the grammar against would be a good thing, and the editing and completion help such a parser can provide would be useful to all xpath authors.

Eric

-- 
Eric Ludlam: za...@gn..., er...@si...
Home: www.ultranet.com/~zappo
Siege: www.siege-engine.com
Emacs: http://cedet.sourceforge.net
GNU: www.gnu.org
From: Alex S. <al...@gn...> - 2001-12-15 14:20:59
"Eric M. Ludlam" <er...@si...> writes:

>>Great! Maybe some part of your work could be used to provide Semantic
>>facilities for an 'xpath-mode'? What do you think? Eric?
> [ ... ]
>
> That certainly makes sense. An xpath mode to bind the grammar against
> would be a good thing, and the editing and completion help such a
> parser can provide would be useful to all xpath authors.

Hm, I still don't understand how this will work. There are no and there probably never will be any XPATH files. XPATH is something like an URL. There is no URL mode. URLs appear in HTML files, though. XPATHs appear in XML (and XSL, which is also XML) files. Thus, the major mode of such files will be xml-mode, sgml-mode, html-mode, xhtml-mode, or something along these lines. If you really want to support XPATH in XML documents, for example, it seems to me that you need to add the XPATH BNF to a more general XSL BNF, and redo the entire thing.

Since the two of you are so enthusiastic, however, I get the feeling that I am missing something. What is it? Some hidden feature of Semantic which allows you to mix and match major modes? Something even more powerful? :)

Alex.
-- 
http://www.emacswiki.org/
From: Alex S. <al...@gn...> - 2001-12-15 14:26:05
Ok, I sent a tarball to David, and I uploaded the remaining files to the Emacs Wiki. The files, and some discussion of how to use them, are available at http://www.emacswiki.org/cgi-bin/wiki.pl?XmlParser

Alex.
-- 
http://www.emacswiki.org/
From: Alex S. <al...@gn...> - 2001-12-16 23:31:36
If anybody is following the XPATH stuff I'm doing, here are the latest changes. I think from here on it will be much easier. :)

This currently accepts equality expressions as predicates. Here are two examples from the test suite:

child::*[position()=2]
self::*[attribute::id=\"compiler\"]

Not-equal and other operators should be pretty easy. I think I'll let it rest soon enough and implement new features on request only. I'll probably turn my attention to XSLT, next.

Alex.

2001-12-17  Alex Schroeder  <ken...@ya...>

    * xpath.bnf (FunctionCall): Use it.
    (PathExpr): Use it.

    * xpath-parser.el (xpath-literal-regexp): New regexp.
    (xpath-lex-region): Return literals.
    Plus more tests.

    * xpath.el (xpath-name-filter): Accept wildcards.
    (xpath-context-node): Renamed from context-node.
    (xpath-context-size): Renamed from context-size.
    (xpath-context-position): Renamed from context-position.
    (xpath-last-function): Use renamed.
    (xpath-position-function): Use renamed.
    (xpath-name-function): Use renamed.
    (xpath-equal): Rewrite.
    (xpath-number): New function.
    (xpath-string): New function.
    (xpath-eval): New function.
    (xpath-resolve-steps): Rewrite of xpath-resolve.
    (xpath-resolve): Use it.
    Plus new test.

Available at http://www.emacswiki.org/cgi-bin/wiki.pl?XmlParser

-- 
http://www.emacswiki.org/
From: Eric M. L. <er...@si...> - 2001-12-17 17:02:49
Hi,

Sorry, I just didn't know what XPATH was for. Letting semantic parse a string without a major mode sounds like a useful feature in general for this type of task.

Eric

>>> Alex Schroeder <al...@gn...> seems to think that:
>"Eric M. Ludlam" <er...@si...> writes:
>
>>>Great! Maybe some part of your work could be used to provide Semantic
>>>facilities for an 'xpath-mode'? What do you think? Eric?
>> [ ... ]
>>
>> That certainly makes sense. An xpath mode to bind the grammar against
>> would be a good thing, and the editing and completion help such a
>> parser can provide would be useful to all xpath authors.
>
>Hm, I still don't understand how this will work. There are no and
>there probably never will be any XPATH files. XPATH is something like
>an URL. There is no URL mode. URLs appear in HTML files, though.
>XPATHs appear in XML (and XSL, which is also XML) files. Thus, the
>major mode of such files will be xml-mode, sgml-mode, html-mode,
>xhtml-mode, or something along these lines. If you really want to
>support XPATH in XML documents, for example, it seems to me that you
>need to add the XPATH BNF to a more general XSL BNF, and redo the
>entire thing. Since the two of you are so enthusiastic, however, I
>get the feeling that I am missing something. What is it? Some hidden
>feature of Semantic which allows you to mix and match major modes?
>Something even more powerful? :)
  [ ... ]
From: David Ponce<dp...@vo...> - 2001-12-17 16:22:10
Hi Alex,

> While compiling the BNF file, I get one shift/reduce conflict. Do you
> think I should invest some energy in eliminating them? The bison
> manual says this is ok, but I'm not sure. I see at least two such
> potential conflicts in the BNF file I have right now:

I had a look at your xpath.bnf and made the following changes to fix the shift/reduce conflict. In fact the rule:

NodeTest -> NodeType LPAREN Arglist RPAREN

conflicted on LPAREN with rule:

NodeTest -> PROCESSING-INSTRUCTION LPAREN LITERAL RPAREN

because NodeType also contained the rule

NodeType -> PROCESSING-INSTRUCTION

So I rewrote the rules like this:

NodeTest : NameTest
           (list 'xpath-name-filter $1)
         | NodeType LPAREN RPAREN
           (list 'xpath-node-type-filter $1)
         | PROCESSING-INSTRUCTION LPAREN RPAREN
           (list 'xpath-node-type-filter $1)
         | PROCESSING-INSTRUCTION LPAREN LITERAL RPAREN
         ;

NodeType : COMMENT
         | TEXT
         | NODE
         ;

Notice that I also removed Arglist from NodeTest rules because the specification says:

[7] NodeTest ::= NameTest
               | NodeType '(' ')'
               | 'processing-instruction' '(' Literal ')'

Attached you will find the BNF file I modified. Could you try it and tell me if it is OK?

Sincerely,
David
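The conflict described in this message can be made concrete with a small, self-contained Emacs Lisp sketch. It is only a toy check, not the LALR table construction that Wisent performs: it looks for one specific pattern, namely an alternative that begins with a nonterminal N next to a sibling alternative that begins with a symbol T, where N -> T is itself a rule and both alternatives continue with the same symbol. In that situation, after reading T the parser can either reduce T to N or shift the next token. The names `my-single-symbol-rules`, `my-find-overlaps`, `my-original-rules` and `my-fixed-rules` are hypothetical, and the full contents of the original NodeType rule are assumed from the XPath specification rather than taken from the attached file.

```elisp
;; Toy grammars as alists: (NONTERMINAL RHS1 RHS2 ...), each RHS a list
;; of symbols.  Only the rules discussed in the message are modelled.
(defconst my-original-rules
  '((NodeTest (NameTest)
              (NodeType LPAREN Arglist RPAREN)
              (PROCESSING-INSTRUCTION LPAREN LITERAL RPAREN))
    ;; Assumed original NodeType, including PROCESSING-INSTRUCTION.
    (NodeType (COMMENT) (TEXT) (PROCESSING-INSTRUCTION) (NODE))))

(defconst my-fixed-rules
  '((NodeTest (NameTest)
              (NodeType LPAREN RPAREN)
              (PROCESSING-INSTRUCTION LPAREN RPAREN)
              (PROCESSING-INSTRUCTION LPAREN LITERAL RPAREN))
    (NodeType (COMMENT) (TEXT) (NODE))))

(defun my-single-symbol-rules (grammar nonterm)
  "Return every symbol S such that GRAMMAR contains NONTERM -> S."
  (let (result)
    (dolist (rhs (cdr (assq nonterm grammar)))
      (when (= (length rhs) 1)
        (push (car rhs) result)))
    (nreverse result)))

(defun my-find-overlaps (grammar)
  "Return (SYMBOL NEXT) pairs where reduce and shift compete in GRAMMAR.
A pair is reported when one alternative starts with N NEXT, a sibling
alternative starts with SYMBOL NEXT, and N -> SYMBOL is a rule."
  (let (found)
    (dolist (entry grammar)
      (let ((alternatives (cdr entry)))
        (dolist (r1 alternatives)
          (dolist (r2 alternatives)
            (when (and (not (eq r1 r2))
                       (cdr r1) (cdr r2) ; both have a second symbol
                       (memq (car r2)
                             (my-single-symbol-rules grammar (car r1)))
                       (eq (cadr r1) (cadr r2)))
              (push (list (car r2) (cadr r2)) found))))))
    (nreverse found)))

(my-find-overlaps my-original-rules)
;; => ((PROCESSING-INSTRUCTION LPAREN))
(my-find-overlaps my-fixed-rules)
;; => nil
```

With the rewritten rules, PROCESSING-INSTRUCTION is no longer reachable through NodeType, so the toy check finds nothing to report.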
From: Alex S. <al...@gn...> - 2001-12-12 00:13:28
Hi David,

I rewrote my BNF file following your suggestions, and I rewrote the lexer I had. It is much simpler than the previous version I had which was more or less copied from wisent-java.el. I also used semantic-flex-syntax-modifications, which Eric suggested. It seems to work! Thanks again.

Next problem, however. :)

I can parse things like these

child::para
child::para/parent::*

But in the following case, the "()" need some special treatment. What can I do? Note that the brackets will be used, later, for other things as well. My first idea was to set the brackets to symbol syntax class -- thus they would be part of the name. But I think this will cause problems later.

child::para/parent::text()

Some examples of planned use of brackets:

child::para[position()=last()-1]
id("foo")/child::para[position()=5]
child::*[substring("12345", -42, 1 div 0)]

Anyway, this is what I currently have after flexing:

(xpath-lex-string "child::para/parent::text()")
((CHILD-AXIS "child" 1 . 6)
 (COLON ":" 6 . 7)
 (COLON ":" 7 . 8)
 (NAME "para" 8 . 12)
 (SLASH "/" 12 . 13)
 (PARENT-AXIS "parent" 13 . 19)
 (COLON ":" 19 . 20)
 (COLON ":" 20 . 21)
 (TEXT-TEST "text" 21 . 25)
 (nil "()" 25 . 27))

I'm not even sure what I want, here... Ideas? (or questions?)

Alex.

;;; xpath-parser.el --- XPATH parser

;; Copyright (C) 2001  Alex Schroeder <al...@gn...>

;; Author: Alex Schroeder <al...@gn...>
;; Maintainer: Alex Schroeder <al...@gn...>
;; Keywords: xml
;; URL: http://www.emacswiki.org/cgi-bin/wiki.pl?XmlParser
;; Version: $Id: xpath-parser.el,v 1.2 2001/12/12 00:07:50 alex Exp alex $

;; This file is not part of GNU Emacs.

;; This is free software; you can redistribute it and/or modify it under
;; the terms of the GNU General Public License as published by the Free
;; Software Foundation; either version 2, or (at your option) any later
;; version.

;; This is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs; see the file COPYING.  If not, write to the
;; Free Software Foundation, Inc., 59 Temple Place - Suite 330,
;; Boston, MA 02111-1307, USA.

;;; Commentary:

;; Used by xpath.el, tables created automatically from xpath.bnf.

(require 'wisent-bovine)

(defvar xpath-tables
  (eval-when-compile
    (wisent-compile-grammar
     '((SLASH COLON LPAREN RPAREN LBRACK RBRACK DOT AT WILDCARD
        ANCESTOR-AXIS ANCESTOR-OR-SELF-AXIS ATTRIBUTE-AXIS CHILD-AXIS
        DESCENDANT-AXIS DESCENDANT-OR-SELF-AXIS FOLLOWING-AXIS
        FOLLOWING-SIBLING-AXIS NAMESPACE-AXIS PARENT-AXIS
        PRECEDING-AXIS PRECEDING-SIBLING-AXIS SELF-AXIS
        NODE-TEST TEXT-TEST NAME)
       nil
       (LocationPath
        ((RelativeLocationPath)))
       (DOUBLESLASH
        ((SLASH SLASH)
         nil))
       (DOUBLECOLON
        ((COLON COLON)
         nil))
       (DOUBLEDOT
        ((DOT DOT)
         nil))
       (RelativeLocationPath
        ((Step))
        ((RelativeLocationPath SLASH Step)
         (list $1 $3)))
       (Step
        ((AxisSpecifier NodeTest)
         (list $1 $2)))
       (AxisSpecifier
        ((AxisName DOUBLECOLON)
         (list 'xpath-add-axis $1)))
       (AxisName
        ((ANCESTOR-AXIS))
        ((ANCESTOR-OR-SELF-AXIS))
        ((ATTRIBUTE-AXIS))
        ((CHILD-AXIS))
        ((DESCENDANT-AXIS))
        ((DESCENDANT-OR-SELF-AXIS))
        ((FOLLOWING-AXIS))
        ((FOLLOWING-SIBLING-AXIS))
        ((NAMESPACE-AXIS))
        ((PARENT-AXIS))
        ((PRECEDING-AXIS))
        ((PRECEDING-SIBLING-AXIS))
        ((SELF-AXIS)))
       (NodeTest
        ((WILDCARD)
         (list 'xpath-add-test $1))
        ((NAME)
         (list 'xpath-add-test $1))
        ((NodeType LPAREN RPAREN)
         (list 'xpath-add-test (concat $1 "()"))))
       (NodeType
        ((TEXT-TEST))
        ((NODE-TEST))))
     'nil))
  "Table for use with semantic for parsing XPATH.")

(defvar xpath-keywords
  (semantic-flex-make-keyword-table
   `(("ancestor" . ANCESTOR-AXIS)
     ("ancestor-or-self" . ANCESTOR-OR-SELF-AXIS)
     ("attribute" . ATTRIBUTE-AXIS)
     ("child" . CHILD-AXIS)
     ("descendant" . DESCENDANT-AXIS)
     ("descendant-or-self" . DESCENDANT-OR-SELF-AXIS)
     ("following" . FOLLOWING-AXIS)
     ("following-sibling" . FOLLOWING-SIBLING-AXIS)
     ("namespace" . NAMESPACE-AXIS)
     ("parent" . PARENT-AXIS)
     ("preceding" . PRECEDING-AXIS)
     ("preceding-sibling" . PRECEDING-SIBLING-AXIS)
     ("self" . SELF-AXIS)
     ("node" . NODE-TEST)
     ("text" . TEXT-TEST))
   '())
  "Table for use with semantic for XPATH keywords.")

(defvar xpath-tokens
  '((literal
     (NAME . ""))
    (close-paren
     (RBRACK . "]")
     (RPAREN . ")"))
    (open-paren
     (LBRACK . "[")
     (LPAREN . "("))
    (punctuation
     (WILDCARD . "*")
     (AT . "@")
     (DOT . "..")
     (COLON . ":")
     (SLASH . "/")))
  "Table for use with semantic for tokens.")

(defun xpath-default-setup ()
  "XPATH parsing setup function."
  ;; Code generated from xpath.bnf
  (setq semantic-toplevel-bovine-table xpath-tables
        semantic-toplevel-bovine-table-source "xpath.bnf")
  (setq semantic-flex-keywords-obarray xpath-keywords)
  ;; End code generated from xpath.bnf
  (setq semantic-flex-syntax-modifications '((?/ ".") (?* "."))))

;;; Lexer

(defvar xpath-token-input nil
  "The parsed XPATH tokens created by `xpath-lex-region'.
The elements in this list are returned one by one using
`xpath-pop-input'.")

(defun xpath-pop-input ()
  "Pop an element from `xpath-token-input'.
If the list is empty, return `wisent-eoi-term' in a list."
  (or (pop xpath-token-input)
      (list wisent-eoi-term)))

(defun xpath-lex-region (start end)
  "Lex the region for XPATH.
This calls `semantic-flex' on the region, munges the result, and
stores a list of tokens in `xpath-token-input'.  The tokens will be
available via `xpath-pop-input' and are suitable for `wisent-parse'
consumption."
  (xpath-default-setup)
  ;; Bind every variable assigned below, so that nothing leaks into
  ;; global variables.
  (let ((objs (semantic-flex start end))
        category pos text key result)
    (dolist (obj objs)
      (setq category (car obj)
            pos (cdr obj)
            text (semantic-flex-text obj)
            key (or (semantic-flex-keyword-p text)
                    (semantic-flex-token-key xpath-tokens category text)))
      ;; The hyphen comes last in the character class so that it is a
      ;; literal hyphen and does not form the range from `.' to `_'.
      (when (and (not key)
                 (string-match "^[a-zA-Z_][a-zA-Z_0-9._-]*$" text))
        (setq key 'NAME))
      (setq result (cons (append (list key text) pos)
                         result)))
    (setq xpath-token-input (nreverse result))))

(defun xpath-lex-string (str)
  "Lex the string STR for XPATH.
This uses `xpath-lex-region', which see."
  (with-temp-buffer
    (insert str)
    (xpath-lex-region (point-min) (point-max))))

;;; Test stuff

(eval-when-compile
  (semantic-flex-keyword-p "child")
  (semantic-flex-keyword-p ":")
  (semantic-flex-token-key xpath-tokens 'punctuation ":")
  (xpath-lex-string "child::para")
  (xpath-lex-string "child::para/parent::*")
  ; (xpath-lex-string "child::para/parent::text()")
  (wisent-parse xpath-tables #'xpath-pop-input #'error))

;;; xpath-parser.el ends here
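One way to look at the "()" question is to ask what token stream the grammar needs: since NodeTest is written as NodeType LPAREN RPAREN, the lexer has to deliver "(" and ")" as two separate tokens rather than one "()" block. The sketch below illustrates that idea with a tiny hand-written tokenizer that uses only core Emacs Lisp and does not go through `semantic-flex` at all. The names `my-xpath-punctuation` and `my-xpath-tokenize` are hypothetical, positions are 0-based string indices rather than buffer positions, and keywords such as "child" or "text" are simply returned as NAME to keep the example short.

```elisp
(defconst my-xpath-punctuation
  '((?/ . SLASH) (?: . COLON) (?\( . LPAREN) (?\) . RPAREN)
    (?\[ . LBRACK) (?\] . RBRACK) (?* . WILDCARD) (?@ . AT)
    (?\. . DOT))
  "Hypothetical map from single characters to token types.")

(defun my-xpath-tokenize (str)
  "Split STR into a list of (TYPE TEXT START . END) tokens.
Every bracket is returned as a token of its own."
  (let ((pos 0)
        (len (length str))
        tokens)
    (while (< pos len)
      (let* ((char (aref str pos))
             (punct (cdr (assq char my-xpath-punctuation))))
        (cond
         ;; Skip whitespace between tokens.
         ((memq char '(?\s ?\t ?\n))
          (setq pos (1+ pos)))
         ;; Each punctuation character is a one-character token, so
         ;; "()" yields LPAREN followed by RPAREN.
         (punct
          (push (cons punct (cons (string char) (cons pos (1+ pos))))
                tokens)
          (setq pos (1+ pos)))
         ;; A name must start exactly at POS.  The hyphen is placed
         ;; last in the character class so that it is literal.
         ((eq (string-match "[a-zA-Z_][a-zA-Z0-9_-]*" str pos) pos)
          (let ((end (match-end 0)))
            (push (cons 'NAME (cons (match-string 0 str) (cons pos end)))
                  tokens)
            (setq pos end)))
         (t
          (error "Unexpected character %c at %d" char pos)))))
    (nreverse tokens)))

(my-xpath-tokenize "text()")
;; => ((NAME "text" 0 . 4) (LPAREN "(" 4 . 5) (RPAREN ")" 5 . 6))
```

Because the brackets stay ordinary punctuation here, the same tokens remain usable for later cases such as predicates and function calls, which is the concern raised above about making them part of the name.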