
pyparsing module is not newbie friendly

2004-12-13
2013-05-14
  • Steven Siew

    Steven Siew - 2004-12-13

    Paul McGuire,

      Your module is very powerful and easy to use IF you know how to use it properly.

      But to be popular, it must also be easy for newbies to learn, and this is where the documentation of your module falls flat in the mud.

      Let me give you an example of how hard it is to learn to use your module.

      Say I'm a newbie and I have found your module on some python website. Wow! Let's learn how to use this wonderful module.

      Step 1) Read the "Using the pyparsing module" document found in the directory created when I untar the pyparsing-1.2.2.tar.gz file.

      Step 2) Run the greeting.py python script.
    $ python greeting.py
    Hello, World! -> ['Hello', ',', 'World', '!']

      Step 3) Copy the script and make changes to it.
    $ cp greeting.py siew.py
    $ vi siew.py
    $ cat siew.py
    # greeting.py
    #
    # Demonstration of the parsing module, on the prototypical "Hello, World!" example
    #
    # Copyright 2003, by Paul McGuire
    #
    from pyparsing import Word, alphas

    # define grammar
    greet = Word( alphas ) + "," + Word( alphas ) + "!"

    # input string
    hello = "Hello!, World!"

    # parse input string
    print hello, "->", greet.parseString( hello )

      Step 4) Run the changed siew.py script.
    $ python siew.py
    Hello!, World! ->
    Traceback (most recent call last):
      File "siew.py", line 16, in ?
        print hello, "->", greet.parseString( hello )
      File "/home/siewsk/python/pyparsing-1.2.2/examples/pyparsing.py", line 503, in parseString
        loc, tokens = self.parse( instring.expandtabs(), 0 )
      File "/home/siewsk/python/pyparsing-1.2.2/examples/pyparsing.py", line 452, in parse
        loc,tokens = self.parseImpl( instring, loc, doActions )
      File "/home/siewsk/python/pyparsing-1.2.2/examples/pyparsing.py", line 1169, in parseImpl
        loc, exprtokens = e.parse( instring, loc, doActions )
      File "/home/siewsk/python/pyparsing-1.2.2/examples/pyparsing.py", line 456, in parse
        loc,tokens = self.parseImpl( instring, loc, doActions )
      File "/home/siewsk/python/pyparsing-1.2.2/examples/pyparsing.py", line 727, in parseImpl
        raise exc
    pyparsing.ParseException: Expected "," (5), (1,6)

      Bang, it falls flat in the mud. Now go back to the "Using the pyparsing module" document, and what does it say about how to handle parsing failures? Nothing! Absolutely no advice on how to use pyparsing to deal with parsing failures.

      The problem is that PARSING failures are why people will need to use the parsing module, and as such it is vital to have very simple examples in the "Using the pyparsing module" documentation showing how to handle them.

      Newbies should NOT have to learn the pyparsing module inside out to know how to handle a simple parsing failure in the hello world program.

      Also, there is a terrible lack of quality tutorials, and the documentation assumes the newbie knows parsing theory inside out.

      Plus, the examples directory gives no guide to which examples are the easiest (for newbies) and which are the hardest. So newbies have to look at every example to work out which one to start learning with. Again, did I mention the lack of quality tutorials? All these new terms, like

    suppress
    ignore
    forward
    SkipTo
    Word

      And Word is not explained in detail. What is the effect of the "optional second string" in Word? When is it used? Why should it be used? It is so confusing for newbies.
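
      A minimal sketch of what the optional second string does, written for Python 3 and a current pyparsing rather than the 1.2.2 release discussed here. The names letters_only and identifier are made up for illustration. With one string, every character of the word must come from that string; with two, the first string constrains only the first character and the second string constrains the rest.

```python
from pyparsing import Word, alphas, alphanums

# One argument: every character of the word must come from `alphas`.
letters_only = Word(alphas)

# Two arguments: the first character must come from `alphas`; the
# remaining characters may come from `alphanums + "_"`.
identifier = Word(alphas, alphanums + "_")

# parseString() only needs to match the start of the input, so the
# one-argument Word simply stops at the first character it cannot accept.
letters_result = list(letters_only.parseString("abc123"))
identifier_result = list(identifier.parseString("abc123"))

print(letters_result)     # stops before the first digit
print(identifier_result)  # digits are allowed after the first character
```

      This is the usual way to describe things like variable names, which may contain digits but may not start with one.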

      If you do not have time to write up a quality tutorial for pyparsing, that is understandable, but you could at least ask your users to help write a newbie's guide to pyparsing. Anything to help newbies get started with your module.

     
    • Paul McGuire

      Paul McGuire - 2004-12-13

      Steve -

      Yowch!!  Did I offend you in some way?  I didn't realize that the lack of tutorial material was deserving of such sarcasm and vitriol.  Nevertheless...

      You have certainly made your point, that for a utility that is supposed to make parsing easier, it needs more newbie-friendly tutorial material.  And this is especially true in the area of grammar debugging and troubleshooting.  Your example of a misformatted "Hello World" string is definitely important, since it is a simple-to-understand case where the input string does not match the parsing grammar.

      The examples directory has really grown mostly on its own, as different people have sent me some samples they thought would be helpful.  I can see that it would be helpful to have a README file for this directory that would list out what each example does, and what it is trying to illustrate.  I'll try to put one together before the next release.

      You'll actually find some 3rd party tutorial material at http://www.rexx.com/~dkuhlman/python_201/python_201.html#SECTION007600000000000000000, written by Dave Kuhlman - I could definitely include a link to this web page from the pyparsing page.  But I can already anticipate that this will fall far short of your expectations.

      I hope to have some time free up from work in the next few months, so that I can get working on more and clearer tutorials and documentation.  I'll be sure to include as many of your constructive suggestions as I can.

      -- Paul

       
    • Steven Siew

      Steven Siew - 2004-12-14

      Here is my python script which handles the parsing failure. I think you should put it in your "Using the pyparsing module" document, as it shows how to handle a simple parsing failure.

      # Parsing_example1.py
      #
      # Demonstration of the parsing module,
      #
      # Copyright 2003, by Paul McGuire
      #
      from pyparsing import Word, alphas, ParseException

      # define grammar
      greet = Word( alphas ) + "," + Word( alphas ) + "!"

      # input_string
      input_string=''

      # Display the grammar and instructions on how to quit the program
      print 'greet = Word( alphas ) + "," + Word( alphas ) + "!"'
      print "Type in the string to be parsed or 'quit' to exit the program"
      input_string = raw_input("> ")

      while input_string != 'quit':

        # try parsing the input string
        try:
          L = greet.parseString( input_string )
        except ParseException:
          L = ['Parse Failure', input_string]

        # show result of parsing the input string
        print input_string, "->", L

        # obtain new input string
        input_string = raw_input("> ")

      print "Good bye!"
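
      A variation on the same idea, sketched for Python 3 and a current pyparsing: instead of only noting that the parse failed, it reports where. The helper name try_parse is made up for illustration. The lineno, col and msg attributes of ParseException carry the same information as the "(5), (1,6)" in the traceback from the first post.

```python
from pyparsing import Word, alphas, ParseException

# same grammar as greeting.py
greet = Word(alphas) + "," + Word(alphas) + "!"

def try_parse(text):
    """Return the list of tokens, or a one-line description of the failure."""
    try:
        return list(greet.parseString(text))
    except ParseException as err:
        # lineno and col are 1-based; msg names the element that was expected
        return "Parse failure at line %d, column %d: %s" % (
            err.lineno, err.col, err.msg)

print(try_parse("Hello, World!"))
print(try_parse("Hello!, World!"))
```

      For the misformatted greeting, the failure is reported at line 1, column 6, which is where the unexpected "!" sits in front of the comma.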

       
    • Steven Siew

      Steven Siew - 2004-12-14

      Paul,

        How do I turn off the ignore_whitespace attribute?

      I want my parser to fail "Hello , World!" but to pass "Hello,world!"

       
    • Paul McGuire

      Paul McGuire - 2004-12-14

      Calling leaveWhitespace() on a ParserElement will disable the skipping of whitespace before matching the characters in the ParserElement's defined pattern.
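
      As a rough sketch of how that might look for the greeting grammar (Python 3 and a current pyparsing assumed; the names loose, strict and matches are made up for illustration): leaveWhitespace() is called on the comma and on the second word, so each of them must start exactly where the previous element ended.

```python
from pyparsing import Literal, Word, alphas, ParseException

# Default behaviour: whitespace in front of each element is skipped.
loose = Word(alphas) + "," + Word(alphas) + "!"

# leaveWhitespace() on the comma and on the second word: no whitespace
# is skipped before them, so a gap in the input makes the match fail.
strict = (Word(alphas)
          + Literal(",").leaveWhitespace()
          + Word(alphas).leaveWhitespace()
          + "!")

def matches(grammar, text):
    """Return True if `grammar` parses the start of `text`, else False."""
    try:
        grammar.parseString(text)
        return True
    except ParseException:
        return False

print(matches(loose, "Hello , World!"))
print(matches(strict, "Hello , World!"))
print(matches(strict, "Hello,world!"))
```

      The final "!" is left as a plain string here, so whitespace in front of it is still skipped; it would need the same treatment if that matters.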

      In general, though, whitespace-skipping is part of the pyparsing philosophy.  This is an outgrowth of a project I worked on that *required* users to enter "2+2=4" as "2 + 2 = 4".  The original code was using a crude whitespace-based tokenizer to break up the expression tokens.

      If you're putting together a tutorial for newbies, I'd prefer if you focused on basic grammar features, options for accessing the parsed results, and debugging/troubleshooting.  Whitespace-sensitive parsing is a more advanced topic that is often unnecessary even for very complex parsers.

      -- Paul

       
    • Paul McGuire

      Paul McGuire - 2004-12-17

      For at least some further information on Word (as well as other classes and their default constructor arguments), please look in the documentation in the htmldoc directory that is included with the release.

      -- Paul

       
    • Scot Wilcoxon

      Scot Wilcoxon - 2006-01-21

      "Calling leaveWhitespace() on a ParserElement" is an example of the problems with documentation for beginners.  This assumes that someone knows what a ParserElement is, and how to do "calling".  Much of the reference documentation is useless without knowing whether the proper incantations resemble Word(printables) or parseResult.Word()=printables.

      Yes, I know Word(printables) seems right.  But I still don't know why "MatchFirst( phrase_one, phrase_two, phrase_three, phrase_four )" gives an init error while "phrase_one | phrase_two | phrase_three | phrase_four" behaves differently.  But maybe I'm wrong in thinking that "|" in this case is indeed trying to do MatchFirst.
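
      For what it is worth, a small sketch of the two spellings (Python 3 and a current pyparsing assumed; the phrase definitions are placeholders, not the ones from the grammar in question). As far as I can tell, MatchFirst expects its alternatives as one list, so passing four separate arguments does not fit its constructor, while "|" builds a MatchFirst from its operands.

```python
from pyparsing import Literal, MatchFirst

# placeholder alternatives, for illustration only
phrase_one = Literal("alpha")
phrase_two = Literal("beta")
phrase_three = Literal("gamma")

# MatchFirst takes a single list of alternatives ...
explicit = MatchFirst([phrase_one, phrase_two, phrase_three])

# ... and the | operator produces the same kind of object.
with_operator = phrase_one | phrase_two | phrase_three

explicit_result = list(explicit.parseString("beta"))
operator_result = list(with_operator.parseString("beta"))

print(isinstance(with_operator, MatchFirst))
print(explicit_result, operator_result)
```

      Both forms try the alternatives from left to right and take the first one that matches.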

       
    • Paul McGuire

      Paul McGuire - 2006-01-21

      Scot -

      Pyparsing has only been around for about 2 years.  I've heard that much software documentation goes unread, and that most people go straight to the code samples.  Since I already have a full-time job, I've focused thus far on the code examples.  Hopefully, from them one can discern such things as how "calling" is performed.

      However, I am trying to do a bit more writing about pyparsing - I have two presentations to come out at PyCon, and an article on OnLamp to get posted soon.  I hope these added resources (when available I'll add links from my project web page) can fill in some of the most newbie of gaps.

      -- Paul

       
    • Scot Wilcoxon

      Scot Wilcoxon - 2006-01-22

      Well, using shorthand such as "calling X" and forgetting that people might not know how to do that is common.  When you're used to working with "X", it is also common to forget that others don't know what "X" is.  Such assumptions sometimes extend up to the level of a company issuing a press release which proudly announces a new and improved version of software or hardware while neglecting to mention what the item is used for.

      Just keep writing about the stuff so we users can figure out additional wonderful uses for it.

       
