From: Simon J. <sje...@bl...> - 2004-06-21 14:30:59
Rui Nuno Capela wrote:

>How can it be that a parser/syntax design is such that it can't resolve
>around distinguishing a keyword from a parameter key name? In my humble
>POV the position within the phrase would suffice to that.
>
>Is it a lex/yacc limitation or what, that a keyword is always taken
>literally, prioritary and unconditionally independent from where it occurs
>within a parsed sentence?
>

It's possible to make it work like you say, but it's not efficient. What
happens at the moment is:

1. The lexical analyser sees the input first, tokenises it, and gives the
   tokens to the parser. The analyser /doesn't know/ where in the sentence
   it is... it simply knows that

     CHANNEL    <----- is a keyword it has been told to recognise
     Channel    <----- is not a keyword so it must be a name
     2.67       <----- is a number
     "CHANNEL"  <----- is a quoted string
     "Channel"  <----- is a quoted string
     "2.67"     <----- is a quoted string
     etc etc

2. The parser receives the tokens from the analyser. It knows exactly
   where in the sentence it is, but a "sentence", from the parser's POV,
   is a sequence of tokens, not a sequence of characters.

The purpose of this is to make everything a lot smaller and faster: the
analyser handles the small-scale structure of the language with simple
pattern-matching rules, whilst the parser only handles the large-scale
structure of the language with more complex grammatical rules.

But to make it work like this you need a language whose parts can be
identified in the analyser by simple pattern matching, not in the parser
by where they are in a sentence.

If you want to distinguish keywords by where they are in a sentence then
you have to abandon the lexical analyser and do /everything/ in the
parser. The parser would have to look at the incoming characters, one at
a time, and decide according to context whether it was seeing a keyword
or a name or a string or a number.
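As a rough illustration of step 1, here is a minimal Python sketch of a context-free tokeniser. It is not the actual lex specification used by the project: the `KEYWORDS` set, the token kind names and the `tokenise` function are all hypothetical, chosen only to reproduce the classification listed above.

```python
import re

# Hypothetical keyword set; the post only names CHANNEL as an example.
KEYWORDS = {"CHANNEL"}

# One pattern per token class. The tokeniser has no notion of where in
# the sentence it is; it classifies purely by what the text looks like.
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<STRING>"[^"]*")
      | (?P<NUMBER>\d+(?:\.\d+)?)
      | (?P<WORD>[A-Za-z_][A-Za-z0-9_]*)
    )
''', re.VERBOSE)


def tokenise(text):
    """Split text into (kind, value) pairs using pattern matching only."""
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}: {text[pos]!r}")
        pos = m.end()
        kind = m.lastgroup
        value = m.group(kind)
        if kind == "WORD":
            # A bare word is a keyword if, and only if, it is in the table.
            kind = "KEYWORD" if value in KEYWORDS else "NAME"
        elif kind == "STRING":
            value = value[1:-1]  # strip the surrounding quotes
        tokens.append((kind, value))
    return tokens


if __name__ == "__main__":
    for token in tokenise('CHANNEL Channel 2.67 "CHANNEL" "Channel" "2.67"'):
        print(token)
```

Note that `CHANNEL` comes out as a keyword wherever it appears, while the quoted `"CHANNEL"` is always a string. A parser that wanted to treat a bare `CHANNEL` as a name in some positions could not rely on this token stream and would have to inspect the characters itself.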
Doing everything in the parser is possible, but it's much more expensive
than doing it in a lexical analyser using simple pattern matching.

To summarise:

~ Tokenising makes the parser smaller and faster.
~ Therefore the protocol should be tokeniseable. Unless there's a
  _strong_ reason not to.

Simon Jenkins
(Bristol, UK)