The L"wide string" notation should probably be handled in the lexical
analyzer. This could be an easy piece in Semantic 2, but in 1.4
you need to modify semantic-c-flex-extensions with a regexp something
like this:
"L\\s\""
A match on that would then need to hand off to the existing string
logic hidden in the lexer to finish the task.
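
A rough sketch of that lexer-level idea, written in C++ rather than
Emacs Lisp purely for illustration. The names Kind, Token, scan_string
and next_token are hypothetical and not part of Semantic. The point it
tries to show is that an L only counts as a wide-string prefix when a
double quote follows it immediately, so a plain identifier L should be
left alone:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token kinds for this sketch; not Semantic's own names.
enum class Kind { Identifier, String, WideString, Other };

struct Token {
    Kind kind;
    std::size_t length;  // number of source characters consumed
};

// Stand-in for the "existing string logic": src[pos] must be the opening
// quote. Returns the index one past the closing quote, or src.size() if
// the string is unterminated.
static std::size_t scan_string(const std::string& src, std::size_t pos) {
    assert(pos < src.size() && src[pos] == '"');
    std::size_t i = pos + 1;
    while (i < src.size() && src[i] != '"') {
        if (src[i] == '\\' && i + 1 < src.size()) ++i;  // skip escaped char
        ++i;
    }
    return i < src.size() ? i + 1 : src.size();
}

static bool ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

Token next_token(const std::string& src, std::size_t pos) {
    if (pos >= src.size()) return {Kind::Other, 0};
    // L directly followed by a quote: wide string, reuse the string scanner.
    if (src[pos] == 'L' && pos + 1 < src.size() && src[pos + 1] == '"')
        return {Kind::WideString, scan_string(src, pos + 1) - pos};
    if (src[pos] == '"')
        return {Kind::String, scan_string(src, pos) - pos};
    // Anything else starting with a letter or underscore is an identifier,
    // including a lone L.
    if (ident_char(src[pos]) && !std::isdigit(static_cast<unsigned char>(src[pos]))) {
        std::size_t end = pos;
        while (end < src.size() && ident_char(src[end])) ++end;
        return {Kind::Identifier, end - pos};
    }
    return {Kind::Other, 1};
}
```

Under these assumptions, L"" comes back as one wide-string token, while
the L in "L = 1;" is still an ordinary identifier.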
As a grammar rule, you could instead define a token L and write a string
rule with an optional L in front, but I'm not sure whether that would break
if a user tried to define a function or variable named L.
For example, the token FLOAT "float" caused trouble when the
user wrote:
#include <float.h>
which required some goofy stuff in c.bnf.
Eric
>>> "Berndl, Klaus" <klaus.berndl@...> seems to think that:
>Hi,
>
>see below the posting that arrived yesterday. I have already fixed c.bnf of
>v1p4 so now it can parse sequences of strings like "klaus" "berndl"
>correctly and also the builtin type wchar_t (BTW: Is there also a wint_t
>type?).
>
>But what would be the best way to parse these L"a wide-string" and L'a'
>(a wide-char-literal, correct?) Could this be done with c.bnf (my first
>tries had no success), or should this be done with
>semantic-flex-extensions, or what other ways exist?
>
>Thanks for help,
>Klaus
>
>
>
>-----Original Message-----
>From: Markus Schöpflin [mailto:markus.schoepflin@...]
>Sent: Tuesday, January 28, 2003 11:52 AM
>To: cedet-semantic@...
>Subject: [cedet-semantic] Parsing of C string constants
>
>
>Currently, C string constants are not always parsed correctly. Here is
>an example.
>
>char const *p = ""; // ok
>char const *q = "" ""; // not ok
>
>wchar_t const *wp = L""; // not ok
>wchar_t const *wq = L"" L""; // not ok
>
>All examples are legal according to the C++ standard.
>
>Here is the definition from the standard:
>
>string-literal:
> "s-char-sequence(opt)"
> L"s-char-sequence(opt)"
>
>s-char-sequence:
> s-char
> s-char-sequence s-char
>
>s-char:
> any member of the source character set except the double quote ",
> backslash \, or newline character
> escape-sequence
> universal-character-name
>
>2.13.4(3): In translation phase 6 (2.1), adjacent narrow string
>literals are concatenated and adjacent wide string literals
>are concatenated. [...]
>
>2.13.4(5): Escape sequences and universal-character-names in string
>literals have the same meaning as in character literals (2.13.2),
>except that the single quote ' is representable either by itself or by
>the escape sequence \', and the double quote " shall be preceded by a
>\. [...]
>
>A universal character name is defined as:
>
>hex-quad:
> hexadecimal-digit hexadecimal-digit hexadecimal-digit
> hexadecimal-digit (four hex digits in a row)
>
>universal-character-name:
> \u hex-quad
> \U hex-quad hex-quad
>
>Escape sequences are defined in the usual way: there are \n, \r, ...,
>octal escape sequences (like \123, length from 1 to 3) and hex
>sequences (like \xabcd, length from 1 to unlimited).
>
>HTH, Markus
[ ... ]
-- 
Eric Ludlam: zappo@..., eric@...om
Home: http://www.ludlam.net Siege: http://www.siege-engine.com
Emacs: http://cedet.sourceforge.net GNU: http://www.gnu.org