On Thu, 01 Nov 2007 19:19:20 -0700
"H. Peter Anvin" <hpa@...> wrote:
> Well, it's still fundamentally DWIM, since no parsing tricks in the
> world is going to tell you if "aaa" on a line by itself was meant as
> a label. Worse, if you get the line "dd aaa", *which one* is the
There are, indeed, some cases in which even a human expert must ask the
programmer what he was trying to do. However, there is a lot of
middle ground between the case of accepting a terminating colon as a
definitive indication of a label -- which we both agree upon -- and
attempting to deal with the above example.
The check-for-a-colon strategy is a clear win for all cases where there
is no label and, of course, for all cases in which the label is
terminated by a colon. The next most significant population is those
cases where there is a label which is not terminated by a colon, and
which is also not a valid operator [e.g. foo]. The strategy I mentioned
would conditionally accept foo as an operator, and attempt to validate
it, which would fail. Then it would accept foo as a label, and attempt
to validate the next token as the operator. The cost of this second
attempt is relatively small, since we need an operator validation
routine in any case.
In addition, we have the case where the first token is intended as a
label, does not have a terminating colon, and is a valid operator, but
there is a parse error later in the statement. The strategy will then
retry the entire parse operation with the first token as a label, thus
allowing such unnatural statements as "mov mov eax,[mov]".
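
To make the strategy concrete, here is a rough sketch in Python. It is
not taken from any real assembler: the ARITY table, the toy operand
grammar, and the names parse_instruction and parse_line are all
assumptions made for illustration. It also assumes that an unknown word
alone on a line is accepted as a bare label.

```python
import re


class ParseError(Exception):
    """Raised when text cannot be validated as an instruction."""


# Hypothetical operator table: mnemonic -> number of operands.
ARITY = {"mov": 2, "add": 2, "dd": 1, "aaa": 0, "ret": 0}
# Toy operand grammar (assumption): a bare word or a bracketed word.
OPERAND = re.compile(r"\w+|\[\w+\]")


def parse_instruction(text):
    """Validate 'op operand,operand' and return (op, operands)."""
    parts = text.split(None, 1)
    if not parts or parts[0].lower() not in ARITY:
        raise ParseError("not a valid operator")
    op = parts[0].lower()
    operands = [o.strip() for o in parts[1].split(",")] if len(parts) > 1 else []
    if len(operands) != ARITY[op]:
        raise ParseError("wrong operand count")
    if not all(OPERAND.fullmatch(o) for o in operands):
        raise ParseError("bad operand")
    return op, operands


def parse_line(line):
    """Return (label, op, operands); label or op may be None."""
    parts = line.split(None, 1)
    if not parts:
        return None, None, []
    first = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    # A terminating colon is a definitive indication of a label.
    if first.endswith(":"):
        if not rest:
            return first[:-1], None, []
        return (first[:-1], *parse_instruction(rest))
    # Conditionally accept the first token as an operator.
    try:
        return (None, *parse_instruction(line))
    except ParseError:
        pass
    # Retry the whole parse with the first token as a label.
    if not rest:
        return first, None, []
    return (first, *parse_instruction(rest))
```

With this sketch, parse_line("foo mov eax,ebx") takes foo as a label
after foo fails validation as an operator, and
parse_line("mov mov eax,[mov]") succeeds only on the retry, with the
first mov as the label. A line such as "foo bar" fails both attempts
and raises ParseError.
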
Now, getting back to your worst case examples, the strategy would
accept "aaa" on a line by itself as an instruction. If it were meant as
a label, then, presumably, it would be referred to somewhere else in
the code, which would generate an undefined label error. If the
programmer meant it as a label and never referred to that label, then
he (or she) deserves whatever happens. To warn sloppy coders against
such errors, we need only point out in the user manual that labels
which match valid operators should be colon terminated.
Likewise, the strategy selects "dd" as an operator and "aaa" as an
operand. If this is not the programmer's intention, then both "dd"
and "aaa" should have been referred to somewhere else in the code, which
would, as above, generate undefined label errors.
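
The undefined label check that catches these cases could look roughly
like the following Python sketch. The (label, op, operands) tuple
layout, the REGISTERS set, and the name undefined_symbols are
assumptions made for illustration, not part of any existing assembler.

```python
# Hypothetical register set; a real assembler knows many more.
REGISTERS = {"eax", "ebx", "ecx", "edx"}


def is_number(text):
    """True if text is an integer literal such as 10 or 0x10."""
    try:
        int(text, 0)
        return True
    except ValueError:
        return False


def undefined_symbols(statements):
    """Return, sorted, every referenced symbol that no statement defines.

    statements is a list of (label, op, operands) tuples, where label
    and op may be None.
    """
    defined = {label for label, _, _ in statements if label}
    referenced = set()
    for _, _, operands in statements:
        for operand in operands:
            name = operand.strip("[]")
            if name not in REGISTERS and not is_number(name):
                referenced.add(name)
    return sorted(referenced - defined)
```

For example, if "aaa" alone was parsed as an instruction and a later
"jmp aaa" refers to it, then
undefined_symbols([(None, "aaa", []), (None, "jmp", ["aaa"])]) reports
aaa. Likewise, "dd aaa" parsed as operator and operand, followed by a
reference to dd, reports both aaa and dd.
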