Hi Eric,
[...]
>>Thanks Eric! I saw that you converted the scheme grammar too. Great!
>>I think I am going to merge semantic-java.el and wisent-java-tags.el,
>>and remove the old java.bnf file.
>>
>>So after that it will not be more necessary to continue support of
>>BNF style? What do you think?
>
> [ ... ]
>
> The scheme file doesn't work for me yet. I'm not yet sure why.
> I also converted the skeleton for the .by format. I haven not yet
> converted erlang, nor do I have an erlang example yet.
>
> Because the scheme parser was giving me a hard time, I started working
> on a debugger interface. I'm familiar with the bovine parser
> (naturally), but could you tell me what sorts of debug information you
> might having at "step" locations during parsing so I can my interface
> compatible.

IMO it will be hard to have a common debugger interface between the
bovinator and wisent.  The difficulty with the wisent parser is that it
is based on states and on what to do at each state (shift, reduce,
accept or error).  For example, when the parser shifts the next
terminal at a given state, it doesn't know yet which rule is being
evaluated; it only knows the next state, based on the token it read.
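
To make the point concrete, here is a minimal sketch of a table-driven shift/reduce loop.  The toy grammar (S -> a S | b), the hand-built tables and every `my-lr-' name are assumptions made up for illustration; they are not Wisent's actual tables or API.  The trace shows that a shift step can only report the token and the next state, while a rule number becomes available only at a reduce step.

```elisp
;; Toy LR tables for the grammar:  rule 1: S -> a S   rule 2: S -> b
;; (hypothetical, hand-built; not produced by wisent-comp.el)
(defconst my-lr-actions
  '((0 (a shift 2) (b shift 3))
    (1 ($ accept))
    (2 (a shift 2) (b shift 3))
    (3 (t reduce 2))                    ; t = whatever the lookahead is
    (4 (t reduce 1)))
  "Alist STATE -> alist TOKEN -> action.")

(defconst my-lr-rules '((1 S 2) (2 S 1))
  "List of (RULE LHS RHS-LENGTH).")

(defconst my-lr-gotos '(((0 . S) . 1) ((2 . S) . 4))
  "Alist (STATE . NONTERMINAL) -> next state.")

(defun my-lr-trace (tokens)
  "Parse TOKENS, a list of symbols ending with `$', and return a trace.
Each trace element describes what is known at that step."
  (let ((stack '(0)) trace done)
    (while (not done)
      (let* ((state (car stack))
             (row (cdr (assq state my-lr-actions)))
             (act (cdr (or (assq (car tokens) row) (assq t row)))))
        (cond
         ((eq (car act) 'shift)
          ;; Only the token read and the next state are known here.
          (push (list 'shift (car tokens) (cadr act)) trace)
          (push (cadr act) stack)
          (setq tokens (cdr tokens)))
         ((eq (car act) 'reduce)
          ;; The rule is identified only now, at reduce time.
          (let* ((rule (assq (cadr act) my-lr-rules))
                 (lhs (nth 1 rule)))
            (setq stack (nthcdr (nth 2 rule) stack))
            (push (cdr (assoc (cons (car stack) lhs) my-lr-gotos)) stack)
            (push (list 'reduce (car rule) lhs) trace)))
         ((eq (car act) 'accept)
          (push '(accept) trace)
          (setq done t))
         (t
          (push '(error) trace)
          (setq done t)))))
    (nreverse trace)))
```

Calling `(my-lr-trace '(a b $))' records two shift steps before any rule number appears in the trace.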

However, when shifting, it should be possible to highlight the token
read by providing a new "edebug" lexer that dynamically replaces the
normal `wisent-lex' lexer when debugging is enabled.  That new lexer
could look like this:

(define-wisent-lexer wisent-edebug-lex
  "Return the next available lexical token in Wisent's form.
The variable `wisent-lex-istream' contains the list of lexical tokens
produced by `semantic-lex'.  Pop the next token available and convert
it to a form suitable for the Wisent's parser."
  (let* ((tk (car wisent-lex-istream)))
    ;; Eat input stream
    (setq wisent-lex-istream (cdr wisent-lex-istream))
    ;; Highlight token in source buffer
    (DO-SOMETHING-TO-HIGHLIGHT-SOURCE tk)
    ;; Return a wisent lexical token
    (cons (semantic-lex-token-class tk)
          (cons (semantic-lex-token-text tk)
                (semantic-lex-token-bounds tk)))))
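
One possible way to fill in the DO-SOMETHING-TO-HIGHLIGHT-SOURCE placeholder is a single reusable overlay.  This is only a sketch: the names `my-edebug-token-overlay' and `my-edebug-highlight-bounds' are hypothetical, and it assumes the token bounds are a pair (START . END) of buffer positions, as returned by `semantic-lex-token-bounds' in the lexer above.

```elisp
(defvar my-edebug-token-overlay nil
  "Hypothetical overlay marking the token most recently read.")

(defun my-edebug-highlight-bounds (bounds)
  "Highlight BOUNDS, a pair (START . END), in the current buffer.
Reuse one overlay so only the latest token stays highlighted.
Return the overlay."
  (if (overlayp my-edebug-token-overlay)
      ;; Move the existing overlay onto the new token.
      (move-overlay my-edebug-token-overlay
                    (car bounds) (cdr bounds) (current-buffer))
    ;; First call: create the overlay and give it a visible face.
    (setq my-edebug-token-overlay
          (make-overlay (car bounds) (cdr bounds) (current-buffer)))
    (overlay-put my-edebug-token-overlay 'face 'highlight))
  my-edebug-token-overlay)
```

In the lexer, the placeholder line would then become something like `(my-edebug-highlight-bounds (semantic-lex-token-bounds tk))'.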

I also made the following patch to wisent-comp.el so that each semantic
action has access to a new internal variable `$action' containing a
pair (NTERM . I), where NTERM is the symbol of the nonterminal the
action belongs to and I is the index of the semantic action inside
the nonterminal definition.  For example, in this definition:

  nonterm: any rule items
           (something-to-do-1)
         | other things
           (something-to-do-2)
         ;

the value of `$action' will be '(nonterm . 0) and '(nonterm . 1) in
the semantic actions `(something-to-do-1)' and `(something-to-do-2)',
respectively.

For this definition with a mid-rule action:

  nonterm: any (something-to-do-0) rule items
           (something-to-do-1)
         | other things
           (something-to-do-2)
         ;

the value of `$action' will be '(nonterm . 0), '(nonterm . 1) and
'(nonterm . 2) in the semantic actions `(something-to-do-0)',
`(something-to-do-1)' and `(something-to-do-2)', respectively.
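
The numbering scheme can be sketched with a small stand-alone function.  It is a simplified model, not the code in the patch: `my-action-locations' is a hypothetical name, and each rule is assumed to have the shape (ITEMS ACTION), with mid-rule actions written as lists inside ITEMS.  As in the patch, a single counter runs across all rules of the nonterminal, and a rule's end action is numbered even when it is omitted.

```elisp
(defun my-action-locations (nonterm rules)
  "Return an alist mapping each action of NONTERM to its (NTERM . I) pair.
RULES is a simplified rule list: each rule is (ITEMS ACTION), where
ITEMS may contain mid-rule actions written as lists."
  (let ((iactn 0)
        locations)
    (dolist (rule rules)
      ;; Mid-rule actions are numbered first, in order of appearance.
      (dolist (item (car rule))
        (when (consp item)
          (push (cons item (cons nonterm iactn)) locations)
          (setq iactn (1+ iactn))))
      ;; Then the action at the end of the rule (nil when omitted).
      (push (cons (cadr rule) (cons nonterm iactn)) locations)
      (setq iactn (1+ iactn)))
    (nreverse locations)))

;; The definition with a mid-rule action from the text above.
(my-action-locations
 'nonterm
 '(((any (something-to-do-0) rule items) (something-to-do-1))
   ((other things) (something-to-do-2))))
```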

Also, each semantic action has access to the internal variable
`$nterm', which gives the "actual" symbol of the nonterminal the
action belongs to.  Unlike `$action', the value of `$nterm' can be a
generated '@n' symbol when inside a mid-rule action.
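
As an illustration of how a debugger hook might consume these two variables, here is a small sketch.  The names `my-wisent-trace' and `my-wisent-trace-action' are hypothetical, and the `let' form only simulates the bindings that the generated code would provide inside a mid-rule action; it does not run a real parser.

```elisp
(defvar my-wisent-trace nil
  "Hypothetical trace of semantic actions run, most recent first.")

(defun my-wisent-trace-action (nterm action)
  "Record that the action located at ACTION ran for nonterminal NTERM.
ACTION is a pair (NTERM . I) as described for `$action'."
  (push (format "%s: action %d of %s" nterm (cdr action) (car action))
        my-wisent-trace))

;; Simulate the bindings seen inside the mid-rule action
;; `(something-to-do-0)' of the example above: `$nterm' is a generated
;; @n symbol, while `$action' still names the enclosing nonterminal.
(let (($nterm (intern "@1"))
      ($action '(nonterm . 0)))
  (my-wisent-trace-action $nterm $action))
```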

Maybe all that could be useful for debugging?

David

P.S.: The patch also includes a lot of whitespace fixes ;-)

Index: wisent-comp.el
===================================================================
RCS file: /cvsroot/cedet/cedet/semantic/wisent/wisent-comp.el,v
retrieving revision 1.17
diff -c -r1.17 wisent-comp.el
*** wisent-comp.el	13 Aug 2002 16:46:11 -0000	1.17
--- wisent-comp.el	3 Feb 2003 13:38:47 -0000
***************
*** 39,45 ****
;; For more details on Wisent itself read the Wisent manual.
;;; History:
! ;;
;;; Code:
(require 'wisent)
--- 39,45 ----
;; For more details on Wisent itself read the Wisent manual.
;;; History:
! ;;
;;; Code:
(require 'wisent)
***************
*** 68,78 ****
`(if (and ,name (symbolp ,name))
(intern (format "wisent-context-%s" ,name))
(error "invalid context name: %S" ,name)))
!
(defmacro wisent-context-bindings (name)
"Return the variables in context NAME."
`(symbol-value (wisent-context-name ,name)))
!
(defmacro wisent-defcontext (name &rest vars)
"Define a context NAME that will bind variables VARS."
(let* ((context (wisent-context-name name))
--- 68,78 ----
`(if (and ,name (symbolp ,name))
(intern (format "wisent-context-%s" ,name))
(error "invalid context name: %S" ,name)))
!
(defmacro wisent-context-bindings (name)
"Return the variables in context NAME."
`(symbol-value (wisent-context-name ,name)))
!
(defmacro wisent-defcontext (name &rest vars)
"Define a context NAME that will bind variables VARS."
(let* ((context (wisent-context-name name))
***************
*** 345,351 ****
(?\d . "'\\d'") ; delete character, DEL
)
"Printed representation of usual escape sequences.")
!
(defsubst wisent-tag (s)
"Return printable form of item number S."
(let ((tag (aref tags s)))
--- 345,351 ----
(?\d . "'\\d'") ; delete character, DEL
)
"Printed representation of usual escape sequences.")
!
(defsubst wisent-tag (s)
"Return printable form of item number S."
(let ((tag (aref tags s)))
***************
*** 558,564 ****
(while (not break)
(setq i (1- n))
(while (natnump i)
! 	;; Np[i] = N[i]
(aset Np i (aref N i))
(setq i (1- i)))
--- 558,564 ----
(while (not break)
(setq i (1- n))
(while (natnump i)
! ;; Np[i] = N[i]
(aset Np i (aref N i))
(setq i (1- i)))
***************
*** 585,597 ****
;; reachable symbols, add the production to the set of reachable
;; productions, and add all of the nonterminals in the RHS of the
;; production to the set of reachable symbols.
!
;; Consider only the (partially) reduced grammar which has only
;; nonterminals in N and productions in P.
!
;; The result is the set P of productions in the reduced grammar,
;; and the set V of symbols in the reduced grammar.
!
;; Although this algorithm also computes the set of terminals which
;; are reachable, no terminal will be deleted from the grammar. Some
;; terminals might not be in the grammar but might be generated by
--- 585,597 ----
;; reachable symbols, add the production to the set of reachable
;; productions, and add all of the nonterminals in the RHS of the
;; production to the set of reachable symbols.
!
;; Consider only the (partially) reduced grammar which has only
;; nonterminals in N and productions in P.
!
;; The result is the set P of productions in the reduced grammar,
;; and the set V of symbols in the reduced grammar.
!
;; Although this algorithm also computes the set of terminals which
;; are reachable, no terminal will be deleted from the grammar. Some
;; terminals might not be in the grammar but might be generated by
***************
*** 645,651 ****
(if (wisent-BITISSET V i)
(setq nuseless-nonterminals (1- nuseless-nonterminals)))
(setq i (1+ i)))
!
;; A token that was used in %prec should not be warned about.
(setq i 1)
(while (<= i nrules)
--- 645,651 ----
(if (wisent-BITISSET V i)
(setq nuseless-nonterminals (1- nuseless-nonterminals)))
(setq i (1+ i)))
!
;; A token that was used in %prec should not be warned about.
(setq i 1)
(while (<= i nrules)
***************
*** 733,739 ****
V1 (make-vector (wisent-WORDSIZE nsyms) 0)
nuseless-nonterminals 0
nuseless-productions 0)
!
(wisent-useless-nonterminals)
(wisent-inaccessable-symbols)
--- 733,739 ----
V1 (make-vector (wisent-WORDSIZE nsyms) 0)
nuseless-nonterminals 0
nuseless-productions 0)
!
(wisent-useless-nonterminals)
(wisent-inaccessable-symbols)
***************
*** 812,818 ****
(while (< i nsyms)
(setq q nil
p (aref dset (- i ntokens))) ;; p = dset[i]
!
(while p
(setq p (aref delts p)
q (cons (car p) q) ;;q++ = p->value
--- 812,818 ----
(while (< i nsyms)
(setq q nil
p (aref dset (- i ntokens))) ;; p = dset[i]
!
(while p
(setq p (aref delts p)
q (cons (car p) q) ;;q++ = p->value
***************
*** 921,927 ****
(setq rp (aref fderives (- i ntokens))
j 0)
(while (<= j nrules)
! 	(if (wisent-BITISSET rp j)
(wisent-log " %d\n" j))
(setq j (1+ j)))
(setq i (1+ i)))))
--- 921,927 ----
(setq rp (aref fderives (- i ntokens))
j 0)
(while (<= j nrules)
! (if (wisent-BITISSET rp j)
(wisent-log " %d\n" j))
(setq j (1+ j)))
(setq i (1+ i)))))
***************
*** 1256,1262 ****
(if (not found)
(if (core-link sp)
(setq sp (core-link sp))
! 		 ;; sp = sp->link = new-state(symbol)
(setq sp (set-core-link sp (wisent-new-state symbol))
found t)))))
;; bucket is empty
--- 1256,1262 ----
(if (not found)
(if (core-link sp)
(setq sp (core-link sp))
! ;; sp = sp->link = new-state(symbol)
(setq sp (set-core-link sp (wisent-new-state symbol))
found t)))))
;; bucket is empty
***************
*** 1868,1874 ****
(setq nedges (1+ nedges)))
(setq j (1+ j)))
! 	(when (> nedges 0)
;; reads[i] = rp = NEW2(nedges + 1, short);
(setq rp (make-vector (1+ nedges) 0)
j 0)
--- 1868,1874 ----
(setq nedges (1+ nedges)))
(setq j (1+ j)))
! (when (> nedges 0)
;; reads[i] = rp = NEW2(nedges + 1, short);
(setq rp (make-vector (1+ nedges) 0)
j 0)
***************
*** 2325,2338 ****
(if (> src-count 0)
(wisent-log " %d shift/reduce conflict%s"
src-count (if (> src-count 1) "s" "")))
!
(if (and (> src-count 0) (> rrc-count 0))
(wisent-log " and"))
(if (> rrc-count 0)
(wisent-log " %d reduce/reduce conflict%s"
rrc-count (if (> rrc-count 1) "s" "")))
!
(wisent-log ".\n")))
(setq i (1+ i)))
(wisent-total-conflicts)))
--- 2325,2338 ----
(if (> src-count 0)
(wisent-log " %d shift/reduce conflict%s"
src-count (if (> src-count 1) "s" "")))
!
(if (and (> src-count 0) (> rrc-count 0))
(wisent-log " and"))
(if (> rrc-count 0)
(wisent-log " %d reduce/reduce conflict%s"
rrc-count (if (> rrc-count 1) "s" "")))
!
(wisent-log ".\n")))
(setq i (1+ i)))
(wisent-total-conflicts)))
***************
*** 2467,2473 ****
(aset lookaheadset k (logand (aref v k)
(aref shiftset k)))
(setq k (1+ k)))
!
(setq i 0)
(while (< i ntokens)
(if (wisent-BITISSET lookaheadset i)
--- 2467,2473 ----
(aset lookaheadset k (logand (aref v k)
(aref shiftset k)))
(setq k (1+ k)))
!
(setq i 0)
(while (< i ntokens)
(if (wisent-BITISSET lookaheadset i)
***************
*** 2510,2516 ****
(aref lookaheadset k)))
(setq k (1+ k)))
(setq i (1+ i))))
!
(fillarray shiftset 0)
(when shiftp
--- 2510,2516 ----
(aref lookaheadset k)))
(setq k (1+ k)))
(setq i (1+ i))))
!
(fillarray shiftset 0)
(when shiftp
***************
*** 2524,2530 ****
(setq i k) ;; break
(wisent-SETBIT shiftset symbol)))
(setq i (1+ i))))
!
(setq i 0)
(while (< i ntokens)
(setq defaulted nil
--- 2524,2530 ----
(setq i k) ;; break
(wisent-SETBIT shiftset symbol)))
(setq i (1+ i))))
!
(setq i 0)
(while (< i ntokens)
(setq defaulted nil
***************
*** 2553,2559 ****
(wisent-tag (aref rlhs (aref LAruleno j))))))
(setq j (1+ j)))
(setq i (1+ i)))
!
(if (>= default-LA 0)
(wisent-log
" $default\treduce using rule %d (%s)\n"
--- 2553,2559 ----
(wisent-tag (aref rlhs (aref LAruleno j))))))
(setq j (1+ j)))
(setq i (1+ i)))
!
(if (>= default-LA 0)
(wisent-log
" $default\treduce using rule %d (%s)\n"
***************
*** 2602,2608 ****
(setq j (1+ j)))
(if (> j 0)
(wisent-log "\n")))
!
(cond
((and (aref consistent state) redp)
(setq rule (aref (reductions-rules redp) 0)
--- 2602,2608 ----
(setq j (1+ j)))
(if (> j 0)
(wisent-log "\n")))
!
(cond
((and (aref consistent state) redp)
(setq rule (aref (reductions-rules redp) 0)
***************
*** 2613,2619 ****
(redp
(wisent-print-reductions state)
))
!
(when (< i k)
(setq v (shifts-shifts shiftp))
(while (< i k)
--- 2613,2619 ----
(redp
(wisent-print-reductions state)
))
!
(when (< i k)
(setq v (shifts-shifts shiftp))
(while (< i k)
***************
*** 2671,2677 ****
"Print information on generated parser.
Report detailed informations if `wisent-verbose-flag' or
`wisent-debug-flag' are non-nil."
! (when (or wisent-verbose-flag wisent-debug-flag)
(wisent-print-useless))
(wisent-print-conflicts)
(when (or wisent-verbose-flag wisent-debug-flag)
--- 2671,2677 ----
"Print information on generated parser.
Report detailed informations if `wisent-verbose-flag' or
`wisent-debug-flag' are non-nil."
! (when (or wisent-verbose-flag wisent-debug-flag)
(wisent-print-useless))
(wisent-print-conflicts)
(when (or wisent-verbose-flag wisent-debug-flag)
***************
*** 2702,2715 ****
shift-state symbol redp shiftp errp nodefault)
(fillarray actrow nil)
!
(setq default-rule 0
nodefault nil ;; nil inhibit having any default reduction
nreds 0
m 0
n 0
redp (aref reduction-table state))
!
(when redp
(setq nreds (reductions-nreds redp))
(when (>= nreds 1)
--- 2702,2715 ----
shift-state symbol redp shiftp errp nodefault)
(fillarray actrow nil)
!
(setq default-rule 0
nodefault nil ;; nil inhibit having any default reduction
nreds 0
m 0
n 0
redp (aref reduction-table state))
!
(when redp
(setq nreds (reductions-nreds redp))
(when (>= nreds 1)
***************
*** 2928,2942 ****
;; These variables only exist locally in the function
;; `wisent-semantic-actions' and are shared by all other nested
! ;; callees. They contain uninterned symbols used in code generation.
(wisent-defcontext semantic-actions
stack sp gotos state)
(defun wisent-semantic-action (r)
"Set up the Elisp function for semantic action at rule R.
! On entry RCODE[R] contains a pair (BODY . N) where BODY is the body of
! the semantic action and N is the maximum number of values available in
! the parser's stack. This replace RCODE[R] by a function of three
arguments:
- the state/value stack
--- 2928,2945 ----
;; These variables only exist locally in the function
;; `wisent-semantic-actions' and are shared by all other nested
! ;; callees.
(wisent-defcontext semantic-actions
+ ;; Uninterned symbols used in code generation.
stack sp gotos state)
(defun wisent-semantic-action (r)
"Set up the Elisp function for semantic action at rule R.
! On entry RCODE[R] contains a vector [BODY N (NTERM . I)] where BODY is the
! body of the semantic action, N is the maximum number of values
! available in the parser's stack, NTERM is the nonterminal the semantic
! action belongs to, and I is the index of the semantic action inside
! NTERM definition. This replace RCODE[R] by a function of three
arguments:
- the state/value stack
***************
*** 2946,2954 ****
that returns the updated top-of-stack index."
(if (not (aref ruseful r))
(aset rcode r nil)
! (let* ((n (cdr (aref rcode r))) ; nb of val avail. in stack
(body (wisent-semantic-action-expand-body
! (car (aref rcode r)) n))
($l (car body)) ; list of $I found in body
(body (cdr body)) ; expanded form of body
(nt (aref rlhs r)) ; nonterminal item no.
--- 2949,2959 ----
that returns the updated top-of-stack index."
(if (not (aref ruseful r))
(aset rcode r nil)
! (let* ((actn (aref rcode r))
! (n (aref actn 1)) ; nb of val avail. in stack
! (def (aref actn 2)) ; (NTERM . I)
(body (wisent-semantic-action-expand-body
! (aref actn 0) n))
($l (car body)) ; list of $I found in body
(body (cdr body)) ; expanded form of body
(nt (aref rlhs r)) ; nonterminal item no.
***************
*** 2991,2996 ****
--- 2996,3002 ----
(let* (,@bl
($region (wisent-region ,@rl))
($nterm ',(aref tags nt))
+ ($action ',def) ; action location
(,sp (- ,sp ,(* rhl 2)))
(,state (aref ,stack ,sp)))
;; push semantic value
***************
*** 3151,3157 ****
(setq def (car defs)
defs (cdr defs)
nonterm (car def)
! rlist (cdr def))
(or (consp rlist)
(error "invalid nonterminal definition syntax: %S" def))
(while rlist
--- 3157,3164 ----
(setq def (car defs)
defs (cdr defs)
nonterm (car def)
! rlist (cdr def)
! iactn 0)
(or (consp rlist)
(error "invalid nonterminal definition syntax: %S" def))
(while rlist
***************
*** 3161,3167 ****
rest (cdr rule)
rhl 0
rhs nil)
!
;; Check & count items
(setq nitems (1+ nitems)) ;; LHS item
(while items
--- 3168,3174 ----
rest (cdr rule)
rhl 0
rhs nil)
!
;; Check & count items
(setq nitems (1+ nitems)) ;; LHS item
(while items
***************
*** 3175,3184 ****
@n (intern (format "@%d" @count)))
(wisent-push-var @n t)
;; Push a new empty rule with the mid-rule action
! (setq semact (cons item rhl)
plevel nil
! rcode (cons semact rcode)
! rprec (cons plevel rprec)
item @n ;; Replace action by @N nonterminal
rules (cons (list item) rules)
nitems (1+ nitems)
--- 3182,3192 ----
@n (intern (format "@%d" @count)))
(wisent-push-var @n t)
;; Push a new empty rule with the mid-rule action
! (setq semact (vector item rhl (cons nonterm iactn))
! iactn (1+ iactn)
plevel nil
! rcode (cons semact rcode)
! rprec (cons plevel rprec)
item @n ;; Replace action by @N nonterminal
rules (cons (list item) rules)
nitems (1+ nitems)
***************
*** 3192,3198 ****
item))))
(setq rhl (1+ rhl)
rhs (cons item rhs)))
!
;; Check & collect rule precedence level
(setq plevel (when (vectorp (car rest))
(setq item (car rest)
--- 3200,3206 ----
item))))
(setq rhl (1+ rhl)
rhs (cons item rhs)))
!
;; Check & collect rule precedence level
(setq plevel (when (vectorp (car rest))
(setq item (car rest)
***************
*** 3203,3211 ****
(wisent-item-number (aref item 0))
(error "invalid rule precedence level syntax: %S" item)))
rprec (cons plevel rprec))
!
;; Check & collect semantic action body
! (setq semact (cons
(if rest
(if (cdr rest)
(error "invalid semantic action syntax: %S" rest)
--- 3211,3219 ----
(wisent-item-number (aref item 0))
(error "invalid rule precedence level syntax: %S" item)))
rprec (cons plevel rprec))
!
;; Check & collect semantic action body
! (setq semact (vector
(if rest
(if (cdr rest)
(error "invalid semantic action syntax: %S" rest)
***************
*** 3214,3221 ****
;; for an empty rule or $1, the value of the
;; first symbol in the rule, otherwise.
(if (> rhl 0) '$1 '()))
! rhl)
! rcode (cons semact rcode))
(setq rules (cons (cons nonterm (nreverse rhs)) rules)
nrules (1+ nrules))))
--- 3222,3231 ----
;; for an empty rule or $1, the value of the
;; first symbol in the rule, otherwise.
(if (> rhl 0) '$1 '()))
! rhl
! (cons nonterm iactn))
! iactn (1+ iactn)
! rcode (cons semact rcode))
(setq rules (cons (cons nonterm (nreverse rhs)) rules)
nrules (1+ nrules))))
***************
*** 3310,3316 ****
(setq start-var (car start-list))
(or (assq start-var defs)
(error "start symbol `%s' has no rule" start-var)))
!
;; 3. START-LIST contains more than one element. All defines
;; potential start symbols. One of them (the first one by
;; default) will be given at parse time to be the parser goal.
--- 3320,3326 ----
(setq start-var (car start-list))
(or (assq start-var defs)
(error "start symbol `%s' has no rule" start-var)))
!
;; 3. START-LIST contains more than one element. All defines
;; potential start symbols. One of them (the first one by
;; default) will be given at parse time to be the parser goal.
***************
*** 3318,3335 ****
;; disabled and the first nonterminal in START-LIST defines
;; the start symbol, like in case 2 above.
((not wisent-single-start-flag)
!
;; START-LIST is a list of nonterminals '(nt0 ... ntN).
;; Build and push ad hoc start rules in the grammar:
!
;; ($STARTS ((nt0) $1) ((nt1) $1) ... ((ntN) $1))
;; ($nt1 (($$nt1 nt1) $2))
;; ...
;; ($ntN (($$ntN ntN) $2))
!
;; Where internal symbols $ntI and $$ntI are respectively
;; nonterminals and terminals.
!
;; The internal start symbol $STARTS is used to build the
;; LALR(1) automaton. The true default start symbol used by the
;; parser is the first nonterminal in START-LIST (nt0).
--- 3328,3345 ----
;; disabled and the first nonterminal in START-LIST defines
;; the start symbol, like in case 2 above.
((not wisent-single-start-flag)
!
;; START-LIST is a list of nonterminals '(nt0 ... ntN).
;; Build and push ad hoc start rules in the grammar:
!
;; ($STARTS ((nt0) $1) ((nt1) $1) ... ((ntN) $1))
;; ($nt1 (($$nt1 nt1) $2))
;; ...
;; ($ntN (($$ntN ntN) $2))
!
;; Where internal symbols $ntI and $$ntI are respectively
;; nonterminals and terminals.
!
;; The internal start symbol $STARTS is used to build the
;; LALR(1) automaton. The true default start symbol used by the
;; parser is the first nonterminal in START-LIST (nt0).
***************
*** 3397,3403 ****
;; precedence of the last terminal in it.
(if (wisent-ISTOKEN item)
(setq pre item))
!
(aset ritem i item)
(setq i (1+ i)
rhs (cdr rhs)))
--- 3407,3413 ----
;; precedence of the last terminal in it.
(if (wisent-ISTOKEN item)
(setq pre item))
!
(aset ritem i item)
(setq i (1+ i)
rhs (cdr rhs)))