I am going through the match attempts for my parser moz-ssql-parser while parsing
select * from task where build.product is not null and build.product!='firefox'
The parser does a lot of trackback as it tries to match the WHERE clause; much more than it should. Please change infixNotation()
to take advantage of the fact that clusters of infix binary operators always begin with an operand, alternate operator and operand, and end with an operand:
[build.product] is [not null] and [build.product] != ['firefox']
(you may notice my grammer did not include not
as a unary operator to infixNotation()
)
By matching this alternating pattern the matching process can go much faster. Precedence is handled with some post processing to orangize this into a tree. Of course, the difficultly will be splitting up a single infixNotation()
call, infected with unary and trinary operators, into a hierarchy of infixNotation()
calls containing only binary operators.
Thank you
If you have a sub-expression that is being repeatedly matched, try enabling packrat parsing: https://pythonhosted.org/pyparsing/pyparsing.ParserElement-class.html#enablePackrat
Packrat is already enabled: https://github.com/mozilla/moz-sql-parser/blob/dev/moz_sql_parser/sql_parser.py#L19
I suspect the number of KNOWN_OPS passed to
infixNotation
https://github.com/mozilla/moz-sql-parser/blob/dev/moz_sql_parser/sql_parser.py#L275 is too great for the givensys.setrecursionlimit(1500)
. Increasing the recursion limit solves the problem for simple cases, but real life expressions can quite a bit longer.infixNotation
is a piece of elgant code, but creates a parser that runs inO(2^n)
wheren
is the number of parsed operators. This can get out of hand quite fast, and the recursion limit is actually protecting the parser from trying too hard on pathelogical input.