Hello,
I have developed quite a large pyparsing grammar to parse VBA macros, and it has given good results until now.
However, parsing has become very slow, to the point that it takes minutes to parse a few simple lines of code.
I suspect something in my grammar is causing a lot of backtracking and recursion.
Is there a simple way to debug a grammar and find out which part consumes most of the time?
Thank you,
Philippe.
Have you enabled packrat parsing? This feature recently underwent a major rewrite, with significant memory and performance improvements. Online docs discuss enabling this feature here: https://pythonhosted.org/pyparsing/pyparsing.ParserElement-class.html#enablePackrat
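For reference, enabling packrat is a single call. The snippet below is only a minimal sketch: the arithmetic grammar is a hypothetical stand-in (not the VBA grammar from this thread), and it assumes the call is made right after importing pyparsing, before any grammar expressions are built.

```python
import pyparsing as pp

# Turn on packrat memoization once, before defining any grammar expressions.
pp.ParserElement.enablePackrat()

# Hypothetical toy grammar, used here only for illustration.
operand = pp.Word(pp.nums) | pp.Word(pp.alphas)
expr = pp.infixNotation(
    operand,
    [
        (pp.oneOf("* /"), 2, pp.opAssoc.LEFT),
        (pp.oneOf("+ -"), 2, pp.opAssoc.LEFT),
    ],
)

# Nested parentheses are a typical case where memoizing sub-results pays off.
result = expr.parseString("((a + 1) * (b - 2)) / c", parseAll=True)
print(result.asList())
```

How much this helps depends on the grammar; recursive expressions that re-parse the same text at the same location tend to benefit most.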
As to your question about how to profile the performance of a grammar: when I do performance profiling, I do it at the pyparsing class level, so this is internal to the module itself. Externally, I expose some debugging actions, which you can configure and enable with setDebugActions and setDebug. You can assign a different action to each of the three phases of parsing an expression: 1: about to parse; 2: successful parse; 3: failed parse. You could tally up the number of attempts, successes and failures by expression, and perhaps that data would give you some insight into where your grammar is spending its time. For instance, are you using Or (which tries every alternative and keeps the longest match) when you could use MatchFirst (which stops at the first alternative that matches)? Or, when using MatchFirst, are you testing the most common expressions first?
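The tallying idea could be sketched roughly as follows. This is an illustration rather than a built-in profiler: the `instrument` helper, the `tally` counter and the two-token grammar are all hypothetical names made up for this example. The callbacks accept extra `*args`/`**kwargs` on the assumption that newer pyparsing releases may pass additional arguments to debug actions.

```python
from collections import Counter

import pyparsing as pp

# (expression label, phase) -> number of times that phase was reached
tally = Counter()


def instrument(expr, label):
    """Hypothetical helper: count try/success/failure events for one expression."""

    def on_try(instring, loc, expr_, *args, **kwargs):
        tally[(label, "try")] += 1

    def on_success(instring, startloc, endloc, expr_, toks, *args, **kwargs):
        tally[(label, "ok")] += 1

    def on_fail(instring, loc, expr_, exc, *args, **kwargs):
        tally[(label, "fail")] += 1

    # setDebugActions installs the three callbacks and switches debugging on.
    expr.setDebugActions(on_try, on_success, on_fail)
    return expr


# Toy grammar, for illustration only.
integer = instrument(pp.Word(pp.nums), "integer")
identifier = instrument(pp.Word(pp.alphas, pp.alphanums + "_"), "identifier")

# MatchFirst ('|') tries 'integer' first, so every identifier in the input
# costs one failed 'integer' attempt before it matches.
statement = pp.OneOrMore(integer | identifier)
statement.parseString("x1 42 y z 7")

for (label, phase), count in sorted(tally.items()):
    print(label, phase, count)
```

An expression with a high failure count relative to its successes is a candidate for reordering within a MatchFirst, or for replacing an Or with a MatchFirst where longest-match behaviour is not needed.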
Thanks a lot, packrat parsing works great! Just by enabling it, the parsing takes seconds or even less, instead of minutes.
BTW, I just published my VBA parser: https://github.com/decalage2/ViperMonkey