
scripting language

Anonymous
2006-02-25
2013-05-14
  • Anonymous

    Anonymous - 2006-02-25

    Hi,

    I'm hoping to implement a little scripting language like sh or csh with a python slant in python.  I had already written a parser to do it, but it always seemed a little brittle.

    I thought I'd give pyparsing a try...  It is very easy to get the easy stuff going, but I'm totally lost when it comes to more complicated things.  None of the examples se

    Here's where I've gotten so far.  I'm trying to get the "\" escape char to negate the quotes, but I don't know how to fit it into the statement.

    Also, how can I give an error if I have an odd number of quotes before a newline?

    Thanks in advance if anyone can help,
    Mike

    ----------------

    from pyparsing import *
    import sys
    import string

    if len(sys.argv) > 1:
        cmdstring = string.join(sys.argv[1:])
    else:
        cmdstring = '''alias dude=holmes; echo \"one two" 'three ' && ver # nuthin more\n'''

    # define grammar
    keywords = ('alias', 'echo', 'setenv', 'ver')  # to be expanded later
    keyword = oneOf( string.join(keywords) )
    argument = Word(alphanums + '_-=/')
    quoted_arg = (  Suppress("'") + CharsNotIn("'") + Suppress("'") ^
                    Suppress('"') + CharsNotIn('"') + Suppress('"') )
    contmode = oneOf( '; | || & &&' ).setResultsName('contmode')
    escapes = Literal('\\') + Word(printables,exact=1)

    statement = Group( keyword + ZeroOrMore(quoted_arg) + ZeroOrMore(argument) +
                        Optional(contmode, default=';') )
    compound_statement = OneOrMore(statement)
    compound_statement.ignore(pythonStyleComment)

    # parse
    print compound_statement.parseString(cmdstring)

     
    • Paul McGuire

      Paul McGuire - 2006-02-27

      Mike -

      I ran your code and it appears to work, but here are some comments:

      1. oneOf takes a list of words, but they must be whitespace-separated.  You should change:
         keyword = oneOf( string.join(keywords) )
      To
         keyword = oneOf( " ".join(keywords) )
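A minimal sketch of point 1, assuming a pyparsing version that still provides the camelCase names used in this thread. The `string.join` function was removed in Python 3, so the `" ".join(...)` form is the one that still runs there. `oneOf` should also accept the sequence itself, which avoids the join altogether.

```python
from pyparsing import oneOf

keywords = ('alias', 'echo', 'setenv', 'ver')  # same tuple as in the post

# Whitespace-separated string, as suggested in point 1
keyword = oneOf(" ".join(keywords))

# Passing the sequence directly is expected to behave the same way
keyword_from_seq = oneOf(keywords)

first = keyword.parseString("setenv PATH")[0]
second = keyword_from_seq.parseString("ver -h")[0]
print(first, second)
```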

      2. Is there a problem using the quotedString built-in in pyparsing?  I think this will handle the '\' character escaping you are looking for.

      Lastly, your approach looks more like you are tokenizing - not that there's anything wrong with that! - when in fact, you can define separate sub-grammars by keyword, and have pyparsing do more semantic processing for you.  For example:

      stringExpr = quotedString  #expand this to handle complex string expressions
      echoCmd = Literal("echo") + stringExpr.setResultsName("echoText")

      Now you can make echoCmd part of a larger grammar of your shell commands, and dispatch directly from the parsed results, instead of having pyparsing just break up your string into tokens and have some other batch of code retrace many of pyparsing's steps in traversing through the list of tokens to interpret them semantically.
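The dispatch idea in the paragraph above could be sketched roughly as follows. This is an illustration only, not the PyCon code mentioned below: the `aliasCmd` grammar, the handler functions, and the `HANDLERS` table are hypothetical names made up for this example.

```python
from pyparsing import Keyword, Suppress, Word, alphanums, quotedString, removeQuotes

# copy() so the parse action is not attached to the shared built-in quotedString
stringExpr = quotedString.copy().setParseAction(removeQuotes)

echoCmd = Keyword("echo") + stringExpr.setResultsName("echoText")

# Hypothetical second command: alias name=value
aliasCmd = (Keyword("alias")
            + Word(alphanums + "_").setResultsName("name")
            + Suppress("=")
            + Word(alphanums + "_-/").setResultsName("value"))

command = echoCmd | aliasCmd

def run_echo(tokens):
    # The named result already holds the unquoted text
    return tokens["echoText"]

def run_alias(tokens):
    return "%s -> %s" % (tokens["name"], tokens["value"])

# Map the leading keyword to the function that interprets the rest
HANDLERS = {"echo": run_echo, "alias": run_alias}

def execute(line):
    tokens = command.parseString(line, parseAll=True)
    return HANDLERS[tokens[0]](tokens)

print(execute('echo "one two"'))
print(execute("alias dude=holmes"))
```

Each handler reads the named results it needs, so no second pass over a flat token list is required.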

      My presentation at PyCon implemented a pyparsing->Command pattern, using a text adventure game as an example.  I'll post that code as soon as I get access back to my web-page (grrrrr!), and post a notice on the pyparsing SF news page.

      -- Paul

       
    • Anonymous

      Anonymous - 2006-02-28

      Thanks for the reply, Paul.

      Regarding #1, both do the same thing, I don't use the second because I find it unintuitive, although string.join isn't much better ... oh well, not important, it's a python issue.

      2.  The quotedString returned the quotes too, which I didn't want really.  It ignored and deleted the backslash escapes.

      Actually I'm not sure what direction I'm going with this.  The reason I am trying to tokenize it all is that I am just trying to get the syntax to work first before I'm going to try to execute  anything.  Another reason is that the command line may be multiple statements joined by semi-colons, maybe on multiple lines. 

      How can I start validating a statement before even knowing if it is single or multiple?  Also, I want to accept regular python statements too, so I need to look at the line and identify it first.

      There is still tons of stuff to figure out, like backslash escapes, redirection, brace, tilde, and command `` expansions, etc.  Even the simple example above doesn't always work if I change something little here or there.

      So I'm not sure this is the correct way to handle the issues or not, I'm asking for advice not just on the module but even on how to approach the problem in general.

      Sorry this is a bit of a drag, I don't mean to burden anyone with solving my problem.  ;)

      On the bright side I've already got a working parser I wrote myself, but I'd like to use something more general, robust, and not have to reinvent the wheel.

       
    • Anonymous

      Anonymous - 2006-02-28

      Here's a newer version, but it doesn't catch the newline as an ending at the end of the first line.  :(

      ===========================================

      #!/bin/env python
      from pyparsing import *
      import sys
      import string

      if len(sys.argv) > 1:
          cmdstring = string.join(sys.argv[1:])
      else:
          cmdstring = '''alias dude=holmes; echo \"one two\" 'three ' && ver # nuthin more
          alias dir = ls -l; echo "one two" 'three ' && ver -h >>/dev/null
      '''

      # define grammar
      ParserElement.setDefaultWhitespaceChars(' \t')
      keywords    = ('alias', 'echo', 'setenv', 'ver')  # to be expanded later
      keyword     = oneOf( string.join(keywords) )
      argument    = Word(alphanums + '_-=/')
      redirector  = oneOf('>e> >e>> < << > >> >3> >3>>')
      path        = Word(printables)
      redirection = redirector + path

      quoted_arg  = ( Suppress("'") + CharsNotIn("'") + Suppress("'") |
                      Suppress('"') + CharsNotIn('"') + Suppress('"') )
      #quoted_arg  = quotedString

      contmode    = oneOf( '; | || & &&' ).setResultsName('contmode')
      escapes = Literal('\\') + Word(printables,exact=1)

      statement = Group(
          keyword +
          ZeroOrMore(escapes) +
          ZeroOrMore(quoted_arg) +
          ZeroOrMore(argument) +
          ZeroOrMore( Group(redirection) ) +
          Optional(contmode, default=';')
          )
      # ZeroOrMore(escapes) +
      compound_statement = OneOrMore(statement) + LineEnd().suppress()
      compound_statement.ignore(pythonStyleComment)

      multi_line_stm = OneOrMore(compound_statement)

      # parse
      if '\n' in cmdstring:   print multi_line_stm.parseString(cmdstring)
      else:                   print compound_statement.parseString(cmdstring)

       
    • Paul McGuire

      Paul McGuire - 2006-02-28

      Mike -

      Hunh! I never used string.join that way.  I guess I just stay away from using the string module, since it is supposed to go away at some point.

      As you say, when you parse a quoted string, you are not often very interested in the quotes.  Pyparsing includes a built-in parse action for removing them.  Try this:

      quoted_arg = quotedString.setParseAction( removeQuotes )
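A small sketch of how this behaves, which also touches on the earlier question about unbalanced quotes. Note that `setParseAction` modifies the expression it is called on, so the example works on a `copy()` of the built-in. The helper `is_balanced` is a made-up name for illustration.

```python
from pyparsing import ParseException, quotedString, removeQuotes

# copy() keeps the shared quotedString built-in untouched
quoted_arg = quotedString.copy().setParseAction(removeQuotes)

# Raw strings keep the backslashes exactly as a user would type them
plain = quoted_arg.parseString(r'"one two"')[0]
escaped = quoted_arg.parseString(r'"foo \"bar\" baz"')[0]

def is_balanced(text):
    """Return True if text is a single, properly closed quoted string."""
    try:
        quoted_arg.parseString(text, parseAll=True)
        return True
    except ParseException:
        # An unterminated quote does not match, so pyparsing raises here
        return False

print(plain)
print(escaped)
print(is_balanced('"one two'))
```

The surrounding quotes are stripped, but backslash escapes inside the string are left as typed; translating them needs a further parse action.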

      What I would do in your case would be to build up my scripting language a command at a time.  So with your language, start with dir and echo.  echo will require a definition for a string expression, but start with something very simple, just one or more quoted strings which our parser will concatenate together.

      stringExpr = OneOrMore(quotedString.setParseAction(removeQuotes))
      stringExpr.setParseAction( lambda s,l,t: "".join(t) )

      echoCmd = Keyword("echo") + stringExpr.setResultsName("echoString")
      dirCmd = Keyword("dir") + filespec

      cmds = echoCmd | dirCmd
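One way the fragment above might be completed into something runnable, as a sketch under assumptions: `filespec` is not defined in the post, so the definition here is a guess, and the `separator`, `statement`, and `line` expressions are additions that reuse the separators from the original grammar to allow several statements per line.

```python
from pyparsing import (Group, Keyword, OneOrMore, Optional, Word,
                       alphanums, oneOf, quotedString, removeQuotes)

# One or more quoted strings, concatenated into a single value
stringExpr = OneOrMore(quotedString.copy().setParseAction(removeQuotes))
stringExpr.setParseAction(lambda s, l, t: "".join(t))

# Assumption: a very loose notion of a file specification
filespec = Word(alphanums + "_-./*")

echoCmd = Keyword("echo") + stringExpr.setResultsName("echoString")
dirCmd = Keyword("dir") + Optional(filespec.setResultsName("filespec"))
cmds = echoCmd | dirCmd

# Statement separators taken from the original post's contmode
separator = oneOf("; | || & &&")
statement = Group(cmds + Optional(separator, default=";"))
line = OneOrMore(statement)

result = line.parseString('echo "one " \'two\'; dir /tmp && dir', parseAll=True)
print(result.asList())
```

Because each statement is wrapped in a `Group`, a single command and a chain of commands come back in the same shape, so the caller does not need to know in advance which one it has.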

      Of course, this is your project, so you are better off with whatever approach makes most sense to you.

      I think you can take it from there.

      -- Paul

       
    • Anonymous

      Anonymous - 2006-03-06

      Thank you very much for the advice.  It seems to be working.

      I'm still not clear on how to handle backslash escape chars ... like "foo \"bar\" ".  Does anyone know?

       
    • Paul McGuire

      Paul McGuire - 2006-03-10

      Mike -

      What do you mean by "handle"?  Do you mean "get rid of the backslashes and translate the escaped char"?  You might want to do this with a parse action attached to quotedString, something like:

      def unescapeBackslashes(s,l,t):
          # expand this list as necessary - last item in list escapes \\ -> \
          escapes = ((r"\t","\t"), (r"\b","\b"), (r"\f","\f"), (r"\n","\n"), ("\\\\","\\"))
          tmp = t[0]
          for lit,rep in escapes:
              tmp = tmp.replace(lit,rep)
          return tmp
         
      sampleData = r"""'This is some sample code containing\tbackslashes that\nshould be converted.'"""
         
      from pyparsing import *
      qtString = quotedString.setParseAction(unescapeBackslashes)
         
      print qtString.parseString(sampleData)[0]

      -- Paul
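One caveat with chained `replace` calls is that an earlier replacement can consume part of a later escape: the three characters backslash, backslash, `n` become a backslash followed by a newline instead of a backslash followed by `n`. A single-pass variant avoids that. The sketch below uses only the standard `re` module; the `ESCAPES` table and the function name are assumptions for illustration, and the function could be called from a parse action in the same way as the one above.

```python
import re

# Character after the backslash -> replacement; extend as necessary
ESCAPES = {"t": "\t", "b": "\b", "f": "\f", "n": "\n",
           "\\": "\\", '"': '"', "'": "'"}

def unescape_single_pass(text):
    """Translate backslash escapes in one left-to-right scan."""
    def replace(match):
        # Unknown escapes are left exactly as they were written
        return ESCAPES.get(match.group(1), match.group(0))
    return re.sub(r"\\(.)", replace, text)

print(unescape_single_pass(r'foo \"bar\" and a tab:\tdone'))
```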

       
