First of all, thank you very much for your work. Excellent!
I'm using pyparsing for a project with ambiguous grammars.
I realized that I need a "GLR" solution.
A very simple example of the shortcoming of the current pyparsing
implementation is the following:
from pyparsing import Literal, OneOrMore
AB = Literal('AB')
node = AB ^ 'A' ^ 'BC'
g = OneOrMore(node)
x = g.parseString("ABC")  # Or picks the longest match first: ['AB']
If you relax the requirement for Or to return the longest
match you could get an overall better match:
['A', 'BC'] is better than ['AB']
The idea is that a local optimization (longest match for Or)
does not imply a global optimization.
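To make the point concrete, here is a minimal stand-alone sketch in plain Python (it does not use pyparsing, and the function names greedy_parse, all_parses and best_parse are made up for illustration). It assumes the alternatives are plain non-empty strings and that "better" means "consumes more input". The first function imitates longest-match-per-step; the second enumerates every sequence by simple backtracking, which is not a real GLR algorithm and can be exponential.

```python
def greedy_parse(text, alternatives):
    """Imitate OneOrMore(Or(...)): take the longest alternative at each step."""
    pos, tokens = 0, []
    while True:
        matches = [alt for alt in alternatives if text.startswith(alt, pos)]
        if not matches:
            break
        longest = max(matches, key=len)
        tokens.append(longest)
        pos += len(longest)
    return tokens


def all_parses(text, alternatives, pos=0):
    """Yield (tokens, end) for every sequence of one or more alternatives.

    Assumes every alternative is a non-empty string, so recursion terminates.
    """
    for alt in alternatives:
        if text.startswith(alt, pos):
            end = pos + len(alt)
            yield [alt], end  # stop after this token
            for rest, final in all_parses(text, alternatives, end):
                yield [alt] + rest, final  # or keep going


def best_parse(text, alternatives):
    """Pick the sequence that consumes the most input (global criterion)."""
    candidates = list(all_parses(text, alternatives))
    if not candidates:
        return []
    tokens, _ = max(candidates, key=lambda c: c[1])
    return tokens


if __name__ == "__main__":
    alts = ["AB", "A", "BC"]
    print(greedy_parse("ABC", alts))  # ['AB']
    print(best_parse("ABC", alts))    # ['A', 'BC']
```

The greedy version commits to 'AB' and then gets stuck on 'C', while the exhaustive version finds the sequence that covers the whole input.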
As a proof of concept I modified pyparsing to get the result.
The hack is really ugly (10 minutes work) but I think it shows
the concept: http://www.enuan.com/glr.tgz
Some more ideas:
1. Parse Element classes should not return a single ParseResults
instance chosen with local optimization but, optionally, return
the whole set of possible ParseResults (See ResultSet)
2. Expressions should recursively evaluate ALL solutions;
the final result is a ResultSet.
I think it could be viable to modify parse and parseImpl in order
to always return a ResultSet.
In the example above the resultset would be:
TODO: TO BE VERIFIED
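One way to explore what such a ResultSet might contain is a toy model where every element returns all of its matches instead of one. The sketch below is only that: the classes Lit, AnyOf and Many1 and the method parse_all are hypothetical stand-ins for Literal, Or, OneOrMore and a modified parseImpl, not pyparsing API, and it assumes partial matches (those that stop before the end of the input) are kept in the set.

```python
class Lit:
    """Stand-in for Literal: at most one match at a given position."""
    def __init__(self, s):
        self.s = s

    def parse_all(self, text, pos):
        if text.startswith(self.s, pos):
            yield [self.s], pos + len(self.s)


class AnyOf:
    """Stand-in for Or, but yielding every alternative, not just the longest."""
    def __init__(self, *exprs):
        self.exprs = exprs

    def parse_all(self, text, pos):
        for expr in self.exprs:
            yield from expr.parse_all(text, pos)


class Many1:
    """Stand-in for OneOrMore: yields every possible number of repetitions."""
    def __init__(self, expr):
        self.expr = expr

    def parse_all(self, text, pos):
        for tokens, end in self.expr.parse_all(text, pos):
            yield tokens, end
            if end > pos:  # guard against looping on zero-width matches
                for more, final in self.parse_all(text, end):
                    yield tokens + more, final


def result_set(expr, text):
    """Collect all (tokens, end_position) pairs starting at position 0."""
    return list(expr.parse_all(text, 0))


if __name__ == "__main__":
    g = Many1(AnyOf(Lit("AB"), Lit("A"), Lit("BC")))
    for tokens, end in result_set(g, "ABC"):
        print(tokens, end)
```

Under these assumptions the set for "ABC" has three entries: ['AB'] ending at 2, ['A'] ending at 1, and ['A', 'BC'] ending at 3. Whether a real ResultSet should keep the partial ones is a design decision.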
- Actions: without big changes it would be possible to support actions
that don't clobber globals. Do you think this is a strong limitation?
- It would be nice to rate the solutions. The default criterion could be
to weight all matches in this way:
1 (default char weight for Parse Element) * num_chars_matched
That could be done in parseString and should be customizable...
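The rating idea can be sketched in a few lines of plain Python. This is only an illustration: the functions score and rank are hypothetical, and keying the custom weights by token text is an assumption (in pyparsing the weight would more naturally live on the Parse Element itself).

```python
def score(tokens, weights=None, default_weight=1):
    """Sum of weight * number of characters matched, over all tokens."""
    weights = weights or {}
    return sum(weights.get(tok, default_weight) * len(tok) for tok in tokens)


def rank(candidates, weights=None):
    """Order candidate token lists from best to worst score."""
    return sorted(candidates, key=lambda toks: score(toks, weights), reverse=True)


if __name__ == "__main__":
    candidates = [["AB"], ["A"], ["A", "BC"]]
    print(rank(candidates))             # default weight of 1 per character
    print(rank(candidates, {"AB": 5}))  # custom weight favouring 'AB'
```

With the default weight, ['A', 'BC'] scores 3 and comes first; giving 'AB' a weight of 5 pushes ['AB'] (score 10) to the top, which shows how a customizable weight changes the preferred solution.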
I'm going to invest a lot in pyparsing, but I definitely need this
feature. I'd be more than happy to support you in any way...
Do you think this is something that we can expect to be available in