Thread: [cedet-semantic] Noob wisent grammar/parser question
From: Thomas J. <tja...@gm...> - 2013-10-28 13:44:14
Hi!

I'm currently working on a new Erlang grammar for semantic/wisent. Lexical analysis works fine, but I can't for the life of me figure out how to get the parser to return a tag with the bounds properly set.

So far I just have a bare-bones grammar with one rule, available here:
https://github.com/tjarvstrand/erl-parse

To "reproduce" in Emacs 24:
- Install erlang-mode
- Clone erl-parse and add it to the load-path
- require erl-parse
- Hit M-x eval-expression RET (erl-parse-string "foo()" 'function-call) RET

Any help would be very much appreciated!

Thanks,
Thomas
From: David E. <de...@ra...> - 2013-10-28 17:07:38
Thomas Järvstrand writes:
> I'm currently working on a new Erlang grammar for semantic/wisent. Lexical
> analysis works fine, but I can't for the life of me figure out how to get the
> parser to return a tag with the bounds properly set.

This is because you've defined your own tag-generating macro CALL-TAG. If you look into wisent/grammar-macros.el, you'll see that, for example, in `wisent-grammar-FUNCTION-TAG' everything is finalized by `wisent-raw-tag', which appends the positional information. You either have to do this in your CALL-TAG macro, or use the macros provided by Wisent (like FUNCTION-TAG).

Good luck,
David
From: Thomas J. <tja...@gm...> - 2013-10-29 07:13:32
A true rookie mistake :-/

Awesome, thank you!

T
From: David E. <de...@ra...> - 2013-10-29 18:45:02
Thomas Järvstrand writes:
> A true rookie mistake :-/

I'm afraid you left rookie-land quite some time ago when you delved into the grammar stuff. :-) I had to look this stuff up in the sources myself, because the Bovine parser appends the location information automatically...

-David
From: Thomas J. <tja...@gm...> - 2013-11-01 09:13:39
Another rookie question.

I'm getting a shift/reduce conflict in my Erlang grammar, similar to the dangling else problem <http://www.gnu.org/software/bison/manual/html_node/Shift_002fReduce.html> in the Bison manual, and I don't seem to be able to solve it using the semantic precedence declarations.

In Erlang a macro is written as ?atom or ?atom([arguments])

In my grammar this translates to the rule:

    macro
      : WHY ATOM PAREN_BLOCK
      | WHY ATOM
      ;

I've tried solving this by changing it to the following (using %nonassoc because I couldn't find any evidence of the %precedence declaration existing in semantic):

    %nonassoc PARAMETERIZED-MACRO
    %nonassoc MACRO
    ...
    %%
    macro
      : WHY ATOM PAREN_BLOCK %prec PARAMETERIZED-MACRO
      | WHY ATOM %prec MACRO
      ;

But I still get a warning for a shift/reduce conflict when compiling the grammar. What is the correct way of solving this issue?

Thanks,
Thomas
From: David E. <de...@ra...> - 2013-11-01 20:54:43
Thomas Järvstrand writes:
> In erlang a macro is written as ?atom or ?atom([arguments])
>
> In my grammar this translates to the rule:
> macro
>   : WHY ATOM PAREN_BLOCK
>   | WHY ATOM
>   ;

First off, even if you get a shift/reduce conflict for this, the default is to shift in such a case, so it should still work. Of course, if the problem is fixable in the grammar, it should be fixed.

> I've tried solving this by changing this to (using %nonassoc because I couldn't
> find any evidence of the %precedence declaration existing in semantic):
> %nonassoc PARAMETERIZED-MACRO
> %nonassoc MACRO

No, %precedence does not exist. However, I think using %nonassoc has pretty much the same effect. As far as I can see, the difference in Bison is whether using the operator in an associative way is a run-time or compile-time error. That being said, I don't think you need to fiddle with precedence here.

> macro
>   : WHY ATOM PAREN_BLOCK %prec PARAMETERIZED-MACRO
>   | WHY ATOM %prec MACRO
>   ;
>
> But I still get a warning for a shift/reduce conflict when compiling the
> grammar. What is the correct way of solving this issue?

I'd really need to see the full grammar to see the problem (the conflict might be due to the interaction of separate rules), but this problem is usually dealt with by using an additional rule for an optional argument which contains an empty match, like

    macro
      : WHY ATOM optional-args
      ;

    optional-args
      : ;; EMPTY
      | PAREN_BLOCK
        (EXPAND $1 argument-list)
      ;

    argument-list
      : ... deal with open/close-paren and list of arguments ...

You should find many examples like this in other grammars, like in c.by the optional initialization of variables:

    varname-opt-initializer
      : semantic-list
      | opt-assign
      | ;; EMPTY
      ;

or in java.wy you'll find lots of non-terminals ending in '_opt'.

-David
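For readers who think more easily in hand-written parsers, the optional-args idea can be sketched as a tiny recursive-descent matcher in C. This is only an analogy to the grammar rule above, not how Wisent's LALR tables work; the `TokenKind` enum and both function names are hypothetical and were made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical token kinds, named after the terminals in the grammar. */
typedef enum { TOK_WHY, TOK_ATOM, TOK_PAREN_BLOCK, TOK_OTHER } TokenKind;

/* optional-args : ;; EMPTY | PAREN_BLOCK
 * Consumes a PAREN_BLOCK if one comes next; otherwise takes the empty
 * alternative and consumes nothing. Returns 1 if arguments were present. */
static int parse_optional_args(const TokenKind *toks, size_t n, size_t *pos)
{
    if (*pos < n && toks[*pos] == TOK_PAREN_BLOCK) {
        (*pos)++;
        return 1;
    }
    return 0;
}

/* macro : WHY ATOM optional-args
 * Returns 1 on a match and advances *pos past the macro; returns 0 and
 * leaves *pos untouched otherwise. *has_args reports which alternative
 * of optional-args was taken. */
int parse_macro(const TokenKind *toks, size_t n, size_t *pos, int *has_args)
{
    size_t p = *pos;

    if (p + 1 >= n || toks[p] != TOK_WHY || toks[p + 1] != TOK_ATOM)
        return 0;
    p += 2;
    *has_args = parse_optional_args(toks, n, &p);
    *pos = p;
    return 1;
}
```

Because there is a single macro rule and the choice is pushed into optional-args, the matcher never has to decide between two competing macro alternatives; it simply takes the PAREN_BLOCK when one follows, which mirrors the "prefer shift" default mentioned above.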
From: Thomas J. <tja...@gm...> - 2013-11-02 11:54:29
Yeah, it turns out that it was the way I was using the rule that was causing the warning. The problem is that, depending on the macro definitions that are present, ?foo(bar) can mean either "replace this expression with the value of the parameterized macro foo(a)" or "replace foo with the value of the unparameterized macro foo and call the result as a function with the argument bar". I guess I'm going to have to build a pre-processor :-/

T
From: David E. <de...@ra...> - 2013-11-02 14:54:26
Thomas Järvstrand writes:
> Yeah, it turns out that it was the way I was using the rule that was causing
> the warning. The problem is that depending on the macro definitions that are
> present ?foo(bar) can mean either "replace this expression with the value of
> the parameterized macro foo(a)" or "replace foo with the value of the
> unparameterized macro foo and call the result as a function with the argument
> bar". I guess I'm going to have to build a pre-processor :-/

Aah, the joys of pre-processing. So if I understand you correctly, the problem you have is equivalent to this in C/C++:

    #define THEFUNC1 some_func
    #define THEFUNC2(x) 5*(x)

    a = THEFUNC1(3) // --> a = some_func(3)
    b = THEFUNC2(3) // --> b = 5*(3)

Is this a correct description of your problem? If so, then Semantic can already deal with this through lex-spp, which does preprocessing as part of the lexing process. If you look at the definition of the C lexer, you'll see that it includes special lexers like `semantic-lex-cpp-define', which parses #define macro definitions, and `semantic-lex-spp-replace-or-symbol-or-keyword', which checks whether a symbol is actually a macro, expands it in-place and returns the correct lexical tokens. You can try it out by putting point on the 'a' and running semantic-lex-test:

    ((symbol 55 . 56)
     (punctuation 57 . 58)
     (symbol "some_func" 59 . 67)
     (semantic-list 67 . 70)
     (symbol 96 . 97)
     (punctuation 98 . 99)
     (number "5" 100 . 111)
     (punctuation "*" 100 . 111)
     (semantic-list
      #("(x)" 0 1
        (macros
         (("x" number "3" 109 . 110))))
      100 . 111))

Handling the C/C++ preprocessor at the lexing stage is surprisingly complex, which is why lex-spp is a large package; it might be overkill for your use case. But even if you don't end up using it, you might see some hints there on how to deal with this problem.

Another possibility would be to not deal with this at the lexing stage, but do it afterwards in the tag expansion. For instance, look at how Semantic makes several tags out of things like

    int a,b=5,c=3;

in semantic-expand-c-tag. This function could also do macro expansion. It's really a matter of what would fit Erlang better (which I know nothing of).

-David
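The two-line C example from this message can be made compilable to show why the same call syntax has two readings. The sketch below keeps the macro names from the thread; the body of `some_func` and the two wrapper functions are hypothetical, added only so the expansions can be observed.

```c
/* Hypothetical stand-in: the thread never defines some_func. */
static int some_func(int x) { return x + 1; }

#define THEFUNC1 some_func   /* object-like: expands to a function name  */
#define THEFUNC2(x) 5*(x)    /* function-like: consumes its own argument */

/* THEFUNC1(3) expands to some_func(3): the parentheses are a call
 * applied to whatever the macro expanded to. */
int via_object_like(void) { return THEFUNC1(3); }

/* THEFUNC2(3) expands to 5*(3): the parentheses belong to the macro
 * invocation itself. */
int via_function_like(void) { return THEFUNC2(3); }
```

The token sequence at both call sites is identical (name followed by a parenthesized list); only the macro definitions decide whether the parentheses are macro arguments or a function call, which is the situation described for ?foo(bar) in Erlang.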
From: Thomas J. <tja...@gm...> - 2013-11-03 20:42:59
Thanks for the extensive reply!

Yes, your example captures the gist of it. Unfortunately I think macro expansion during lexing would be impractical, because macros are usually defined in header files that are often included using a path that depends on the compile-time environment. For example, Erlang's xunit, eunit, has assertion macros that are normally included with:

    -include_lib("eunit/include/eunit.hrl").

The compiler then expects to find something on its path named eunit or eunit-<version> which must contain the include/eunit.hrl file.

I will have to think some more on it. As a start I'm going to use the parsing to figure out the arity of a function call, and for that it's enough to be able to recognize ?foo(a)(b) as a single argument.

Thanks
Thomas
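The arity idea can be illustrated without any macro expansion: counting commas at the outermost nesting level of the argument list is enough to treat ?foo(a)(b) as one argument. The following is a rough sketch in C over raw text, not over Wisent tokens; `count_call_args` is a hypothetical helper, and it assumes the text contains no strings, quoted atoms, comments, or other constructs where brackets or commas would not count.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the comma-separated arguments of a parenthesized list such as
 * "(X, ?foo(a)(b))". Commas inside nested (), [] or {} are ignored.
 * Returns -1 if text does not start with '(' or the list is never closed.
 * Simplification: bracket kinds are not checked against each other. */
int count_call_args(const char *text)
{
    int depth = 0;
    int args = 0;
    int seen_content = 0;  /* non-space seen in the current argument */

    if (text == NULL || text[0] != '(')
        return -1;

    for (const char *p = text; *p != '\0'; p++) {
        char c = *p;
        if (c == '(' || c == '[' || c == '{') {
            if (depth > 0)
                seen_content = 1;  /* nested bracket is argument content */
            depth++;
        } else if (c == ')' || c == ']' || c == '}') {
            depth--;
            if (depth == 0)        /* closing paren of the argument list */
                return args + (seen_content ? 1 : 0);
        } else if (c == ',' && depth == 1) {
            args++;                /* top-level comma ends one argument */
            seen_content = 0;
        } else if (!isspace((unsigned char)c)) {
            seen_content = 1;
        }
    }
    return -1;  /* input ended before the list was closed */
}
```

With this approach both readings of ?foo(a)(b) give the same count, so the arity of the enclosing call can be determined before deciding how macros should eventually be expanded.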