Hi,
I'm trying to parse a potentially complex quoted string. I've used
quotedString with success for simple cases, but unfortunately the grammar
I need is a bit more complex. It has to take into account an escape
char "\" and also throw an exception when the number of quotes is not even.
Like a bash script, it must follow these rules (and maybe more that I'm
forgetting):
* Quotes must be of the same type and paired, otherwise throw an
  exception (how do I force it to do that?)
* An unquoted or double-quoted escape char encodes the next char, and
  the escape itself is removed (this includes escaped quotes)
* A single-quoted escape char is ignored and left unchanged
* An escaped quote at the edge or in the middle of a word should not
  split the word; it should stay where it is
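
To make the rules above concrete, here is a minimal hand-rolled sketch in plain Python (not pyparsing). The names split_command and url_encode are hypothetical, "encodes" is assumed to mean hex percent-encoding, and raising ValueError is just one possible choice of exception.

```python
def url_encode(ch):
    """Percent-encode one character (assumed meaning of "encode")."""
    return '%%%02X' % ord(ch)


def split_command(text, encode=url_encode):
    """Split text into words using the quoting rules listed above.

    Raises ValueError on an unpaired quote or a trailing backslash.
    """
    words = []
    current = []      # characters of the word being built
    in_word = False   # True once a word has started (even an empty '' one)
    quote = None      # None, "'" or '"' while inside a quoted section
    i = 0
    while i < len(text):
        ch = text[i]
        if quote == "'":
            # Single quotes: everything is literal, backslash included.
            if ch == "'":
                quote = None
            else:
                current.append(ch)
        elif ch == '\\':
            # Unquoted or double-quoted escape: drop the backslash and
            # encode the next character (quotes included).
            if i + 1 >= len(text):
                raise ValueError('trailing escape character')
            current.append(encode(text[i + 1]))
            in_word = True
            i += 1
        elif quote == '"':
            if ch == '"':
                quote = None
            else:
                current.append(ch)
        elif ch in '\'"':
            # Opening quote: stays part of the current word, no split.
            quote = ch
            in_word = True
        elif ch.isspace():
            if in_word:
                words.append(''.join(current))
                current = []
                in_word = False
        else:
            current.append(ch)
            in_word = True
        i += 1
    if quote is not None:
        raise ValueError('no closing %s quote' % quote)
    if in_word:
        words.append(''.join(current))
    return words


print(split_command(r'''echo \z "\Hello" 'Hel\lo' it\'s'''))
# -> ['echo', '%7A', '%48ello', 'Hel\\lo', 'it%27s']
```

With an input that ends in an unpaired quote, such as \'Hello' as the last word, this sketch raises ValueError instead of returning a partial result.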
I've got this much code for the problem, but it doesn't meet the
criteria above. The escape chars are not found before the double
quotes, and words are being split on interior quotes. Also, I need an
exception to be thrown for the last token.
#!/usr/bin/env python
from pyparsing import *

# test input; the last token has an unpaired quote and should fail
cmdstring = r''' echo \z "\Hello" '\Hello' 'Hel\lo' \'Hello' '''
print(cmdstring)

def encode(s, loc, match):
    'encode a char into url format'
    # hex value of the char, e.g. 'z' -> '%7A'
    return '%%%02X' % ord(match[0])

# define grammar
sq_args = sglQuotedString.setParseAction(removeQuotes)
dq_args = dblQuotedString.setParseAction(removeQuotes)
args = Word(alphanums + '_-=/')  #CharsNotIn(''' "' ''')
nextchar = Word(printables, exact=1)
nextchar.setParseAction(encode)
escaped = Suppress('\\') + nextchar
statement = ( ZeroOrMore(
    sq_args |
    escaped |
    dq_args |
    args
    ) )

try:
    print(statement.parseString(cmdstring))
except ParseException as pe:
    print(pe)
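
For comparison, the standard library's shlex module already raises on an unclosed quote. This is only a rough sketch of that behaviour: split_like_shell is a hypothetical wrapper, and shlex's POSIX rules differ from the rules above in at least two ways (it does not url-encode anything, and inside double quotes it keeps a backslash unless it precedes a double quote or another backslash).

```python
import shlex


def split_like_shell(text):
    """Split text with shlex's POSIX-style rules.

    shlex.split raises ValueError when a quote is left unclosed.
    """
    return shlex.split(text)


# Balanced input: the escaped quote stays attached to its word.
balanced = r"""echo \z "\Hello" 'Hel\lo' \'Hello"""
print(split_like_shell(balanced))
# -> ['echo', 'z', '\\Hello', 'Hel\\lo', "'Hello"]

# The test string from the script above ends with an unpaired quote.
unbalanced = r''' echo \z "\Hello" '\Hello' 'Hel\lo' \'Hello' '''
try:
    split_like_shell(unbalanced)
except ValueError as err:
    print('rejected: %s' % err)
```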
Thanks if you can help,
-Mike
-------------------------------------------------------------
Mike Miller Earth, Sol, Orion Arm, Milky Way