Re: [cedet-semantic] state in parsers
From: Kristof B. <kr...@re...> - 2015-11-03 14:34:24
Actually the first problem can be solved in the lexer, by remembering the
indentation after the 'let', 'where', 'do' or 'of' keyword. The second
problem is harder, since whether to insert a ';' token depends on the
parser context.

The mailer messed up the formatting, so I'll try again; hopefully it is
right now:

let a = 20
    b = 30

and

let a = 20
  b = 30

are not the same thing. The first translates to

let {a = 20
    ;b = 30 }

and the second to (the illegal)

let {a = 20}
  b = 30

And for example the following

let a = 20
    b = 30
    in a + b

translates to

let {a = 20
    ;b = 30
    }in a + b

not to

let {a = 20
    ;b = 30
    ;in a + b}

which is illegal. Where to insert ';' depends on the parser context; it's
not possible to do at the lex phase.

Regards,
Kristof Bastiaensen

On 03-11-15 15:21, Kristof Bastiaensen wrote:
> Hi,
>
> the problem in haskell is that simply specifying indent is not enough,
> for example:
>
> let a = 20
> b = 30
>
> and
> let a = 20
> b = 30
>
> are not the same thing. The first translates to
>
> let {a = 20
> ;b = 30 }
>
> and the second to (the illegal)
>
> let {a = 20}
> b = 30
>
> And for example the following
>
> let a = 20
> b = 30
> in a + b
>
> translates to
>
> let {a = 20
> ;b = 30
> }in a + b
>
> not to
>
> let {a = 20
> ;b = 30
> ;in a + b}
>
> Which is illegal. Where to insert ';' depends on the parser really,
> it's hard to do at the lex phase.
>
> Regards,
> Kristof Bastiaensen
>
> On 03-11-15 13:22, Eric Ludlam wrote:
>> On 11/02/2015 07:13 AM, Kristof Bastiaensen wrote:
>>> Hi,
>>>
>>> I'd like to add support for haskell to semantic. Haskell syntax is
>>> sensitive to indentation, and I'd like to know if it is possible to
>>> use bovine or wisent for it.
>> Yes.
>>
>>> The layout rule is explained here:
>>> https://www.haskell.org/onlinereport/haskell2010/haskellch2.html#x7-210002.7
>>>
>>> What it does, is after certain keywords, whenever a '{' character is
>>> omitted, it enters a layout context. In the layout context it will
>>> remember the indentation of the next token. When a line has the same
>>> indentation as this token, a ';' token is inserted before the
>>> expression. Whenever the indentation is less than this, a '}' token
>>> is inserted, and the layout context is ended. The layout context is
>>> also ended when a parser error would occur.
>> The python grammar in semantic/wisent/python.wy and python.el is an ok
>> example. The premise is that when you develop the lexer, you teach it
>> to convert your indentation into lexical tokens that you then use in
>> your grammar.
>>
>> In the python.el support file, you will notice it uses
>> (current-indentation) and uses that to derive INDENT and DEDENT
>> tokens. The grammar uses those to create INDENT and DEDENT blocks so
>> it can recurse into code bodies. You could just as easily skip them
>> too if you prefer.
>>
>>> So I would set a state variable to all the layout context indentations,
>>> and then change the behaviour of the lexer or parser based on this
>>> state. Is that possible? Would it interfere with incremental parsing?
>> Definitely start with the lexer, and there is good documentation on
>> creating a lexer in the semantic language developers guide. Use the
>> lexer testing function to make sure it outputs what you expect, and
>> then you can continue on with the grammar.
>>
>> Eric
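
To make the layout rule discussed in this thread concrete, here is a rough
sketch in Python (not Emacs Lisp, and not semantic's lexer API). The names
tokenize, apply_layout and close_let_on_in are hypothetical, made up for
this illustration. It assumes whitespace-separated tokens and no explicit
braces, and it special-cases only the 'in' keyword rather than implementing
the general "end the context on a parse error" rule.

```python
import re

# Keywords that open a layout context.
LAYOUT_KEYWORDS = {"let", "where", "do", "of"}


def tokenize(source):
    """Split on whitespace, keeping each token's line and 0-based column."""
    tokens = []
    for line_no, line in enumerate(source.splitlines()):
        for match in re.finditer(r"\S+", line):
            tokens.append((match.group(), line_no, match.start()))
    return tokens


def apply_layout(tokens, close_let_on_in=True):
    """Insert '{', ';' and '}' tokens according to a simplified layout rule.

    close_let_on_in (hypothetical flag) stands in for parser context:
    when True, an 'in' token closes the innermost 'let' block instead of
    being treated as a new item of that block.
    """
    out = []
    contexts = []      # stack of (column, keyword that opened the block)
    opener = None      # layout keyword still waiting for its first token
    prev_line = None
    for text, line, col in tokens:
        first_on_line = line != prev_line
        prev_line = line
        if opener is not None:
            # The token after a layout keyword fixes the block's column.
            out.append("{")
            contexts.append((col, opener))
            opener = None
            first_on_line = False  # never put ';' before the first item
        ends_let = (close_let_on_in and text == "in"
                    and any(kw == "let" for _, kw in contexts))
        if ends_let:
            # Parser-context case: close blocks up to the nearest 'let'.
            while contexts:
                _, kw = contexts.pop()
                out.append("}")
                if kw == "let":
                    break
        elif first_on_line:
            # Pure indentation rule: less indented closes, equal separates.
            while contexts and col < contexts[-1][0]:
                contexts.pop()
                out.append("}")
            if contexts and col == contexts[-1][0]:
                out.append(";")
        out.append(text)
        if text in LAYOUT_KEYWORDS:
            opener = text
    if opener is not None:
        out.extend(["{", "}"])  # layout keyword at end of input: empty block
    out.extend(["}"] * len(contexts))
    return out


example = "let a = 20\n    b = 30\n    in a + b"
print(" ".join(apply_layout(tokenize(example))))
# let { a = 20 ; b = 30 } in a + b
print(" ".join(apply_layout(tokenize(example), close_let_on_in=False)))
# let { a = 20 ; b = 30 ; in a + b }
```

With close_let_on_in=False the sketch behaves like a lexer that only looks
at indentation and produces the illegal form from the message above. The
flag is a stand-in for information that, in a real implementation, would
have to come from the parser.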