Re: [q-lang-users] New stuff in cvs: multichar ops, views
From: Albert G. <Dr....@t-...> - 2007-05-31 14:17:27
> Rob Hubbard wrote:
>> It seems that now, introducing a new symbol will affect the way that
>> code is parsed. This is something I find a little worrying.
>
> You're right. Right now the lexer inspects the symbol table to partition
> punctuation symbols. I agree that this is a bad idea since it makes the
> syntax depend on the declared operator symbols. I will fix that right
> away.

Well, it sounded like a good idea, but actually it isn't. Applying a
naive "maximal munch" rule breaks quite a lot of existing code, since
code like '[0..#B-1]' then becomes a syntax error ('..#' is flagged as
undefined, instead of parsing it as two lexemes '..' and '#'). Just
excluding special lexemes like '..' from the maximum munch rule doesn't
work either since then you couldn't define an operator like '.*' or
':+'.

So I guess that we'll just have to live with the fact that if you
declare an operator symbol then you're actually changing the lexical
syntax of the language (which is already the case with operators like
(xor) anyway, it's just not so blatantly obvious). I'll add a warning
about this to the manual.

Note, however, that the module system does help with stuff like this,
since just adding an operator to your own script doesn't change the way
that, say, the standard library modules are parsed, since your
definition is not in scope there. It's just that you have to be careful
with your own operator declarations. If you declare an operator like
'public (..#) X Y;' then you can't write something like '[0..#B-1]' in
the scope of that definition and expect it to mean '[0 .. #B-1]'. If you
do silly things like that (i.e., introduce an operator symbol which ends
in something which can also be interpreted as a unary operator) then you
get what you called for. ;-) Sharp knife and all that...

Ok, here's the "maximal munch" rule as it is implemented right now. I
actually think that it works pretty well; at least it doesn't disrupt
any existing code that I've tried.

MAXIMAL MUNCH RULE.
Operator symbols consisting of punctuation are generally parsed using
the "longest possible lexeme" a.k.a. "maximal munch" rule. More
precisely, this means that in a _declaration_ like 'public (+-&%) X Y;'
the symbol being declared always extends up to the closing ')'
delimiter. Outside of declarations, however, the "longest possible
lexeme" refers to the longest prefix of the input such that the sequence
of punctuation characters actually forms a _valid_, i.e., declared or
reserved, symbol. Thus, e.g., '..#' will actually be parsed as '.. #'
(reserved '..' symbol followed by a '#' operator).

Cheers,
Albert

--
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag
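
The rule for punctuation outside of declarations can be sketched roughly
as follows. This is only an illustration in Python, not the actual Q
lexer: the function name split_punctuation and the small symbol sets are
made up for the example, and the real implementation may well differ in
how it reports an unknown symbol.

```python
def split_punctuation(run, symbols):
    """Split a run of punctuation characters into lexemes.

    At each position, take the longest prefix of the remaining input
    that is a known (declared or reserved) symbol.
    """
    lexemes = []
    pos = 0
    while pos < len(run):
        # Try the longest candidate first, then shrink it one character
        # at a time until it matches a known symbol.
        for end in range(len(run), pos, -1):
            if run[pos:end] in symbols:
                lexemes.append(run[pos:end])
                pos = end
                break
        else:
            # No prefix at this position is a known symbol.
            raise SyntaxError(f"undefined symbol at {run[pos:]!r}")
    return lexemes


# Hypothetical symbol table holding only the symbols needed here.
known = {"..", "#", "-"}

# With only '..' and '#' known, '..#' splits into two lexemes.
print(split_punctuation("..#", known))

# Once '..#' is declared as an operator, the same input is one lexeme.
print(split_punctuation("..#", known | {"..#"}))
```

The second call shows the caveat discussed above: declaring '..#' changes
how existing input in the scope of that declaration is split.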