Re: [Linuxsampler-devel] [linuxsampler] Latest CVS Commit

Rui Nuno Capela wrote:

>How can it be that a parser/syntax design is such that it can't
>distinguish a keyword from a parameter key name? In my humble
>POV the position within the phrase would suffice for that.
>
>Is it a lex/yacc limitation or what, that a keyword is always taken
>literally, with priority and unconditionally, independent of where it
>occurs within a parsed sentence?
>
It's possible to make it work like you say, but it's not efficient.

What happens at the moment is:

1. The lexical analyser sees the input first, tokenises it, and gives the
tokens to the parser. The analyser /doesn't know/ where in the sentence
it is... it simply knows that

CHANNEL      <----- is a keyword it has been told to recognise
Channel      <----- is not a keyword so it must be a name
2.67         <----- is a number
"CHANNEL"    <----- is a quoted string
"Channel"    <----- is a quoted string
"2.67"       <----- is a quoted string
etc etc

2. The parser receives the tokens from the analyser. It knows exactly
where in the sentence it is, but a "sentence", from the parser's POV, is
a sequence of tokens not a sequence of characters.
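
As a rough illustration of step 1 (a Python sketch, not the actual
lex rules used in linuxsampler): the keyword set and the patterns for
names, numbers and strings below are assumptions made up for this
example. The point is only that the analyser classifies each chunk of
input by its spelling, with no knowledge of where it sits in the
sentence.

```python
import re

# Hypothetical keyword set for illustration; the real protocol's list is longer.
KEYWORDS = {"CHANNEL"}

# One simple pattern per token class (assumed patterns, not the real ones).
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<STRING>"[^"]*")
      | (?P<NUMBER>\d+(?:\.\d+)?)
      | (?P<WORD>[A-Za-z_][A-Za-z_0-9]*)
    )
''', re.VERBOSE)

def tokenise(text):
    """Split text into (kind, value) pairs using pattern matching only."""
    text = text.rstrip()
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        kind = m.lastgroup
        value = m.group(kind)
        if kind == "WORD":
            # A word is a keyword purely because of how it is spelled,
            # never because of where it appears.
            kind = "KEYWORD" if value in KEYWORDS else "NAME"
        tokens.append((kind, value))
        pos = m.end()
    return tokens

print(tokenise('CHANNEL Channel 2.67 "CHANNEL"'))
```

The parser in step 2 would then only ever see the (kind, value) pairs,
never the raw characters.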

The purpose of this is to make everything a lot smaller and faster: The
analyser handles the small-scale structure of the language with simple
pattern matching rules, whilst the parser only handles the large-scale
structure of the language with more complex grammatical rules.

But to make it work like this you need a language whose parts can be
identified in the analyser by simple pattern matching, not in the parser
by where they are in a sentence.

If you want to distinguish keywords by where they are in a sentence
then you have to abandon the lexical analyser and do /everything/
in the parser. The parser would have to look at the incoming characters,
one at a time, and decide according to context whether it was seeing
a keyword or a name or a string or a number. This is possible, but it's
much more expensive than doing it in a lexical analyser using simple
pattern-matching.
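
For comparison, here is a minimal Python sketch of the position-based
approach. The sentence form "<command> <key>=<value>" and the command
set are hypothetical, invented for this example and not taken from the
real protocol. It reads characters directly and decides what each word
is by where it occurs, so CHANNEL is accepted as a key name in second
position. Note how the character-level rules end up woven into the
grammar code.

```python
# Hypothetical command set; only used to illustrate the idea.
COMMANDS = {"SET"}

def parse_sentence(text):
    """Parse '<command> <key>=<value>' directly from characters."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos] == " ":
            pos += 1

    def read_word():
        nonlocal pos
        start = pos
        while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
            pos += 1
        if pos == start:
            raise ValueError(f"expected a word at position {start}")
        return text[start:pos]

    skip_spaces()
    command = read_word()   # first position: must be a known command
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")

    skip_spaces()
    key = read_word()       # second position: always a key name,
                            # even if it is spelled like a keyword

    if pos >= len(text) or text[pos] != "=":
        raise ValueError(f"expected '=' at position {pos}")
    pos += 1
    value = text[pos:].strip()
    return {"command": command, "key": key, "value": value}

print(parse_sentence("SET CHANNEL=3"))
```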

To summarise:
~ Tokenising makes the parser smaller and faster.
~ Therefore the protocol should be tokeniseable.

Unless there's a _strong_ reason not to.

Simon Jenkins
(Bristol, UK)