On Tue, May 6, 2008 at 12:42 AM, Gre7g Luterman <haf...@ya...> wrote:
> Hey guys! My parser has been working great, but I've
> run into a small snag in parsing C-style escapes. For
> example:
>
> >>> import pyparsing as PP
> >>> P = PP.QuotedString(quoteChar='"', escChar="\\")
> >>> P.parseString('"This is a quote: \\" This is a CR: \\n"')
> (['This is a quote: " This is a CR: n'], {})
>
> As you can see, specifying escChar="\\" worked well at
> first, as the parser recognized that \" is a character
> in the string and not an end of quote. HOWEVER, when
> the \n didn't match the \" pattern it was looking for,
> instead of leaving it alone, it completely dropped the
> \.
>
> I need it to leave that \ intact so that I can find
> other string constants such as \n, \7, \xFF, etc. I
> do this by re-parsing the strings with my code.
> Perhaps that is not best? What should I do?
>
I stumbled upon the same problem some time ago. According to Paul,
there is a bug in QuotedString. You can find a discussion of the
problem, and a workaround here:
http://pyparsing.wikispaces.com/message/view/home/3778969
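Here is what I ended up with. It uses a Regex instead of QuotedString,
so the match comes back with its quotes and backslashes intact: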
-----
from pyparsing import Regex
import re
s = '"This is a quote: \\" This is a CR: \\n"'
qstring2 = Regex(r'\"(?:\\\"|\\\\|[^"])*\"', re.MULTILINE)
print(qstring2.parseString(s))
Output:
['"This is a quote: \\" This is a CR: \\n"']
----
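Since the backslashes survive, you can decode the remaining escapes
(\n, \7, \xFF, etc.) in a second pass. Below is a minimal sketch of such
a pass, not a complete C escape decoder. It uses only the standard re
module, and decode_c_escapes is a made-up helper name, not something
pyparsing provides. I am assuming octal escapes are 1-3 digits and hex
escapes are 1-2 digits; unknown escapes just lose their backslash.

```python
import re

# Hypothetical helper (not part of pyparsing): single-letter escapes
# that map directly to one control character.
_SIMPLE = {"n": "\n", "t": "\t", "r": "\r", "a": "\a", "b": "\b",
           "f": "\f", "v": "\v"}

# A backslash followed by a hex escape, an octal escape, or any one character.
_ESCAPE = re.compile(r"\\(x[0-9A-Fa-f]{1,2}|[0-7]{1,3}|.)", re.DOTALL)


def _replace(match):
    body = match.group(1)
    if body[0] == "x" and len(body) > 1:
        return chr(int(body[1:], 16))   # \xFF style
    if body[0] in "01234567":
        return chr(int(body, 8))        # \7 style (assumed 1-3 octal digits)
    # Known letters become control characters; anything else
    # (including \" and \\) becomes the character itself.
    return _SIMPLE.get(body, body)


def decode_c_escapes(token):
    """Strip the surrounding quotes from a raw token and decode its escapes."""
    return _ESCAPE.sub(_replace, token[1:-1])


raw = '"This is a quote: \\" This is a CR: \\n"'
print(repr(decode_c_escapes(raw)))
```

If you want this to happen during parsing, it should be possible to hook
it up as a parse action, for example
qstring2.setParseAction(lambda t: decode_c_escapes(t[0])).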
Hope this helps!
- Kjell Magne Fauske