#88 New Parsers (attached)

v1.0 (example)
open
nobody
parser (1)
5
2016-02-14
2016-02-13
Dan Strohl
No

In a parsing project I am working on (validating domain names), I needed to validate both the length of a token and the number of tokens, so I created the attached additional parsers. In case they are of interest, I am passing them back to you if you want to include them. (I am not as familiar with SourceForge as I am with git, so I don't know how to do a pull request here, sorry.)

These work from a basic POV; however, I did not include any of the debug methods or other associated things that they probably need to fit into the ecosystem. I am happy to add this stuff if you could give me an example or starting point. I looked at the existing ones, but was not able to easily figure out which ones I need to override, and which ones I could just change a property on or otherwise ignore.

1 Attachments

Discussion

  • Dan Strohl

    Dan Strohl - 2016-02-13

    And the tests for these.

     
    • Paul McGuire

      Paul McGuire - 2016-02-14

      Dan -

      Thanks for taking the time to write up these proposed classes to be added to
      Pyparsing.

      In the interests of keeping the API small and easy to learn, I have a high
      barrier for adding new classes to Pyparsing. In many of my own parsers, I
      will create small functions or closures to generate repetitive expressions
      or parse actions.

      Please look over these alternatives to your proposed new classes, mostly
      using variations on parse actions and conditions (newly added in a recent
      release):

      from pyparsing import Word, nums, OneOrMore, locatedExpr

      # define some baseline expressions - an integer is a word made of nums,
      # and an oddnum is an integer that ends with 1, 3, 5, 7 or 9
      integer = Word(nums)
      integers = OneOrMore(integer)
      oddnum = integer().addCondition(lambda t: t[0][-1] in set('13579'))

      # CountIn - exactly 2 of the matched integers are odd
      expr1 = integers()
      expr1.addCondition(lambda t: list(t).count(oddnum) == 2)

      # Count - exactly 3 integers were matched
      expr2 = integers()
      expr2.addCondition(lambda t: len(t) == 3)

      # Len - the matched text spans exactly 5 characters
      expr3 = integers()
      expr3 = locatedExpr(expr3)
      expr3.addCondition(lambda t: t[0].locn_end - t[0].locn_start == 5)
      expr3.addParseAction(lambda t: t[0].value)

      for expr in (expr1, expr2, expr3):
          print(expr.parseString("1 2 3"))

      In any event, these feel fairly specialized to me still, so for the moment,
      I'm going to hold off on incorporating them into the standard Pyparsing
      release. For your application, you might consider making yourself these
      little macro functions (note that "expr()" is the new shorthand for
      "expr.copy()"):

      CountIn = lambda expr, match, n: expr().addCondition(
          lambda t: list(t).count(match) == n)

      Count = lambda expr, n: expr().addCondition(lambda t: len(t) == n)

      Len = lambda expr, n: locatedExpr(expr).addCondition(
          lambda t: t[0].locn_end - t[0].locn_start == n
      ).addParseAction(lambda t: t[0].value)

      (I'm especially pleased with how easy CountIn is to write, using the
      standard count() method of lists to do equality checking, and using the '=='
      override that allows you to test the matching of an expression with a
      string, to give you the count of tokens that match another parse expression
      - in this case, finding the number of odd numbers in a list of matched
      integers.)
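
The '==' behaviour described in the parenthetical can be tried on its own. This is only a small sketch: the names matches_13, matches_12 and odd_count are made up for illustration, and it assumes a pyparsing version in which comparing an expression to a string tests whether the expression matches the whole string.

```python
from pyparsing import Word, nums

integer = Word(nums)
# An odd number is an integer whose last digit is odd.
oddnum = integer.copy().addCondition(lambda t: t[0][-1] in "13579")

# Comparing an expression with a string asks "does this expression
# match the whole string?" rather than testing object identity.
matches_13 = (oddnum == "13")
matches_12 = (oddnum == "12")

# list.count() compares every element with oddnum using ==, so it ends
# up counting the strings that oddnum would match.
odd_count = ["1", "2", "3"].count(oddnum)
```

This is the mechanism CountIn leans on: each string token is compared against the expression, and the comparison falls through to the expression's own '==' handling.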

      Len was probably the one that gave me the most trouble, using the
      locatedExpr helper, a condition, and a parse action to return back the
      original matched tokens. But I would rather work with the actual start and
      end locations as the length to be evaluated, rather than running the tokens
      together using ''.join().
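
For readers who prefer named functions to lambdas, the three macros might be sketched as below. The function names count_in, count, length and accepts are hypothetical and not part of pyparsing; the sketch uses expr.copy() rather than the expr() shorthand, and assumes the locatedExpr helper is available under that name in the installed version.

```python
from pyparsing import OneOrMore, ParseException, Word, locatedExpr, nums


def count_in(expr, match, n):
    """Copy of expr that succeeds only if exactly n tokens match `match`."""
    return expr.copy().addCondition(lambda t: list(t).count(match) == n)


def count(expr, n):
    """Copy of expr that succeeds only if it yields exactly n tokens."""
    return expr.copy().addCondition(lambda t: len(t) == n)


def length(expr, n):
    """Wrap expr so it succeeds only if the matched text spans n characters."""
    located = locatedExpr(expr)
    located.addCondition(lambda t: t[0].locn_end - t[0].locn_start == n)
    # Hand back the originally matched tokens instead of the location group.
    located.addParseAction(lambda t: t[0].value)
    return located


def accepts(expr, text):
    """Hypothetical helper: True if expr parses the start of text."""
    try:
        expr.parseString(text)
        return True
    except ParseException:
        return False


integer = Word(nums)
integers = OneOrMore(integer)
oddnum = integer.copy().addCondition(lambda t: t[0][-1] in "13579")

three_ints = count(integers, 3)
two_odds = count_in(integers, oddnum, 2)
five_chars = length(integers, 5)
```

With these definitions, accepts(three_ints, "1 2 3") should hold while accepts(three_ints, "1 2") should not; the base integers expression is left unmodified because each helper works on a copy or a wrapper.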

      Thanks for this submission - if you like, I can repackage them in the
      Pyparsing examples, as they are a novel and non-trivial use of some of the
      newer features in pyparsing.

      Regards,

      -- Paul



       
      • Dan Strohl

        Dan Strohl - 2016-02-14

        I do have a request, though (or more of a suggestion, I guess):

        For the examples / documentation, it would be really nice to have a list of the techniques / functions used per example, and possibly an index of these. Sometimes you note them in the descriptions, but other times it just says "A dice roll parser and evaluator for evaluating strings such as "4d20+5.5+4d6.takeHighest(3)".", which would be great if I were trying to figure out how to roll some dice, but not so much in telling me that it has an example of operatorPrecedence and CaselessLiteral in there.

        It's not a big thing, but it would be nice.

         
      • Dan Strohl

        Dan Strohl - 2016-02-14

        re: "But I would rather work with the actual start and
        end locations as the length to be evaluated, rather than running the tokens
        together using ''.join()."

        I thought about that, but I wanted my measurements to account for things like content replacements, and to leave out .suppress()ed tokens.
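
The difference between the two ways of measuring can be seen with a suppressed separator. This is only an illustrative sketch: the two-label grammar and the names two_labels, joined_length and span_length are made up here and are not taken from the attached parsers.

```python
from pyparsing import Suppress, Word, alphas, locatedExpr

label = Word(alphas)
# The dot is required in the input but dropped from the results.
two_labels = label + Suppress(".") + label

text = "www.example"

# Measuring the tokens: the suppressed "." is not counted.
tokens = two_labels.parseString(text)
joined_length = len("".join(tokens))

# Measuring the source span: the "." is counted, because the span
# covers everything between the start and end locations.
located = locatedExpr(two_labels).parseString(text)[0]
span_length = located.locn_end - located.locn_start
```

Here the joined tokens come to 10 characters while the source span is 11, so the two approaches give different answers as soon as tokens are suppressed or replaced.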

         
        • Paul McGuire

          Paul McGuire - 2016-02-15

          Good point, this is also a problem for originalTextFor (which I thought of
          using for Len instead of that goofy locatedExpr mess, but it discards the
          originally parsed tokens).

          I've gotten a number of suggestions for similar recipes, parse actions, and
          pre-defined expressions (like a Regex for a floating point number). The
          itertools module contains a number of recipes in its documentation; maybe I
          should capture a bunch of these in an example or the docs. (One user took a
          stab at this in the public Pyparsing wiki, but it never got much traction.)

          -- Paul


           
  • Dan Strohl

    Dan Strohl - 2016-02-14

    Thanks, I didn't see the .addCondition() method. (I was looking for something like that; I thought about using .addAction(), but I was not sure if raising an exception at that point was a good idea.)

    No problem on not including them, especially since it looks pretty easy to do without these. (I am always a fan of keeping things simple.)

     
  • Dan Strohl

    Dan Strohl - 2016-02-14

    OK, actually, in looking again, I did see addCondition, but was not sure how to use it; the docs are pretty light for that method.
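
As a rough illustration of how addCondition might be used, and how it compares with raising an exception from a parse action, here is a minimal sketch. The 256 limit, the check_range function and the error_message helper are hypothetical; the sketch assumes addCondition accepts a message keyword for the failure text.

```python
from pyparsing import ParseException, Word, nums

integer = Word(nums)

# Option 1: a condition. The callable returns True or False, and
# pyparsing raises the ParseException itself when it returns False.
via_condition = integer.copy().addCondition(
    lambda t: int(t[0]) < 256, message="value must be below 256")


# Option 2: a parse action that raises ParseException explicitly.
def check_range(s, loc, toks):
    if int(toks[0]) >= 256:
        raise ParseException(s, loc, "value must be below 256")


via_action = integer.copy().addParseAction(check_range)


def error_message(expr, text):
    """Hypothetical helper: the failure message, or None on success."""
    try:
        expr.parseString(text)
    except ParseException as err:
        return err.msg
    return None
```

Both variants reject "300" and accept "42"; the condition form simply keeps the validation to a one-line predicate.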

     
