Menu

#18 infixNotation() might be made better

v1.0 (example)
open
nobody
None
5
2018-08-14
2018-08-02
No

I am going through the match attempts for my parser moz-ssql-parser while parsing

select * from task where build.product is not null and build.product!='firefox'

The parser does a lot of trackback as it tries to match the WHERE clause; much more than it should. Please change infixNotation() to take advantage of the fact that clusters of infix binary operators always begin with an operand, alternate operator and operand, and end with an operand:

[build.product] is [not null] and [build.product] != ['firefox']

(you may notice my grammer did not include not as a unary operator to infixNotation())

By matching this alternating pattern the matching process can go much faster. Precedence is handled with some post processing to orangize this into a tree. Of course, the difficultly will be splitting up a single infixNotation() call, infected with unary and trinary operators, into a hierarchy of infixNotation() calls containing only binary operators.

Thank you

Discussion

  • Kyle Lahnakoski

    Kyle Lahnakoski - 2018-08-14

    Packrat is already enabled: https://github.com/mozilla/moz-sql-parser/blob/dev/moz_sql_parser/sql_parser.py#L19

    I suspect the number of KNOWN_OPS passed to infixNotation https://github.com/mozilla/moz-sql-parser/blob/dev/moz_sql_parser/sql_parser.py#L275 is too great for the given sys.setrecursionlimit(1500). Increasing the recursion limit solves the problem for simple cases, but real life expressions can quite a bit longer.

    infixNotation is a piece of elgant code, but creates a parser that runs in O(2^n) where n is the number of parsed operators. This can get out of hand quite fast, and the recursion limit is actually protecting the parser from trying too hard on pathelogical input.

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.