pyparsing grammar debugging and profiling

2016-09-10
2016-09-30
  • Philippe Lagadec

    Hello,
    I have developed a fairly large grammar with pyparsing (to parse VBA macros), with good results so far.
    However, parsing has become very slow, to the point that it takes minutes to parse a few simple lines of code.
    So I guess something in my grammar is wrong and causes a lot of backtracking and recursion.
    Is there a simple way to debug a grammar and find out which part consumes most of the time?
    Thank you,

    Philippe.

     
  • Paul McGuire

    Paul McGuire - 2016-09-10

    Have you enabled packrat parsing? This feature recently underwent a major rewrite, with significant memory and performance improvements. Online docs discuss enabling this feature here: https://pythonhosted.org/pyparsing/pyparsing.ParserElement-class.html#enablePackrat
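
    For readers following along, here is a minimal sketch of what enabling packrat looks like. The toy grammar is an assumption made up for illustration (it is not Philippe's VBA grammar); it is deliberately written so that alternatives share a prefix, which forces the same nested text to be re-parsed many times unless results are memoized. Only ParserElement.enablePackrat() comes from the thread.

```python
from pyparsing import Forward, ParserElement, Suppress, Word, nums

# Turn on memoization of parse results. It is a global, class-level switch,
# and is usually called once, right after importing pyparsing.
ParserElement.enablePackrat()

# Hypothetical toy grammar with heavy backtracking: every alternative
# starts with the same sub-expression, so a failure late in the first
# alternative makes the parser retry the same text from the same position.
expr = Forward()
atom = Word(nums) | Suppress("(") + expr + Suppress(")")
term = (atom + "*" + atom) | atom
expr <<= (term + "+" + expr) | term

# Ten levels of nesting: without packrat each level is re-parsed about four
# times by its parent, so the work grows exponentially with depth.
text = "(" * 10 + "1" + ")" * 10
result = expr.parseString(text)
print(result.asList())
```

    Commenting out the enablePackrat() call and increasing the nesting depth is a quick way to see the difference on this toy input; how much a real grammar benefits depends on how often it revisits the same position with the same expression.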

     
  • Paul McGuire

    Paul McGuire - 2016-09-10

    As to your question about how to profile the performance of a grammar: when I do performance profiling, I do it at the pyparsing class level, so this is internal to the module itself. Externally, I expose some debugging actions, which you can configure and enable with setDebugActions and setDebug. You can assign a different action to each of the three phases of parsing an expression: 1: about to parse; 2: successful parse; 3: failed parse.

    You could tally up the number of attempts, successes and failures for each expression, and that data might give you some insight into where your grammar is spending its time. For instance, are you using Or (which evaluates every alternative and keeps the longest match) when you could use MatchFirst (which stops at the first alternative that matches)? Or, when using MatchFirst, are you testing the most common expressions first?
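
    The tallying idea above could be sketched roughly as follows. This is an illustration under assumptions, not code from pyparsing or from Paul: the tally counter, the three handler functions and the instrument helper are hypothetical names, and the two-expression grammar is a stand-in. The handlers accept extra arguments because newer pyparsing releases pass an additional cache-hit flag to debug actions.

```python
from collections import Counter
from pyparsing import Word, alphas, nums

# (expression name, event) -> count; hypothetical helper structure
tally = Counter()

def on_try(instring, loc, expr, *args, **kwargs):
    # Phase 1: the expression is about to be tried at position loc
    tally[(str(expr), "try")] += 1

def on_match(instring, startloc, endloc, expr, toks, *args, **kwargs):
    # Phase 2: the expression matched
    tally[(str(expr), "match")] += 1

def on_fail(instring, loc, expr, exc, *args, **kwargs):
    # Phase 3: the expression raised a parse exception
    tally[(str(expr), "fail")] += 1

def instrument(expr, name):
    """Name an expression and attach the three counting actions to it."""
    expr.setName(name)
    # setDebugActions also switches debugging on for this expression
    expr.setDebugActions(on_try, on_match, on_fail)
    return expr

integer = instrument(Word(nums), "integer")
identifier = instrument(Word(alphas + "_", alphas + nums + "_"), "identifier")

# "|" builds a MatchFirst: alternatives are tried left to right and the
# first one that matches wins, so the order of alternatives matters.
operand = integer | identifier

# Sample input where identifiers are more common than integers
for text in ["x", "y1", "42", "total"]:
    operand.parseString(text)

for (name, event), count in sorted(tally.items()):
    print(name, event, count)
```

    On this sample input most of the attempts on integer end in failure, which suggests that identifier should be listed first in the MatchFirst. The same kind of report on a real grammar may point at expressions that are attempted far more often than they succeed.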

     
  • Philippe Lagadec

    Thanks a lot, packrat parsing works great! Just by enabling it, parsing now takes seconds or even less, instead of minutes.

     
  • Philippe Lagadec

    BTW, I just published my VBA parser: https://github.com/decalage2/ViperMonkey

     