Won't match WordEnd after optional, and absent, element
Brought to you by:
ptmcg
In the following example, I expect a match to be found with {'a': 'A', 'b': ''}, but no match is found.
    #!/usr/bin/env python3
    #coding=utf-8

    from pyparsing import *

    text = 'ABC'

    a = Literal('A')
    b = oneOf(['B',''])

    pattern = Combine(a.setResultsName('a') +
                      b.setResultsName('b') +
                      WordEnd('A'))

    pattern.parseString(text)
Using "Optional" instead of "oneOf" yields no match either.
    #!/usr/bin/env python3
    #coding=utf-8

    from pyparsing import *

    text = 'ABC'

    a = Literal('A')
    b = Literal('B')

    pattern = Combine(a.setResultsName('a') +
                      Optional(b.setResultsName('b')) +
                      WordEnd('A'))

    pattern.parseString(text)
What you describe is the intended behavior of WordEnd. In your example, 'ABC', there is no word break after 'A' or 'AB'. For WordEnd to match, you would have to parse a string like 'A BC', 'AB C', 'AB(C'. 'AB' has to be followed by a character that is not in the normal set of word characters. What are you trying to accomplish with this usage of WordEnd?
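A minimal sketch of the behavior described in this reply, assuming WordEnd is used with no argument so that its default word characters apply. The helper name `matches` is made up for illustration and is not part of pyparsing.

```python
from pyparsing import Literal, ParseException, WordEnd

# 'AB' followed by a word end; WordEnd() with no argument uses its
# default set of word characters.
ab_then_end = Literal('AB') + WordEnd()

def matches(text):
    """Return True if ab_then_end matches at the start of text (hypothetical helper)."""
    try:
        ab_then_end.parseString(text)
        return True
    except ParseException:
        return False

print(matches('AB C'))  # True: a space follows 'AB', so the word ends there
print(matches('ABC'))   # False: 'C' continues the word
```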
Please note that I am giving an argument to WordEnd. It is my understanding that it specifies what characters are allowed in the word:
If you leave out the b part, it works as I would expect, including matching a WordEnd right after the 'A':
I am parsing values for electronic components, such as resistors, capacitors and inductors. As is often the case in electronics, these are written with the unit of measurement left out, but with an optional unit prefix still present, i.e., it might say "100 k" instead of "100 kΩ".
Somewhat simplified, my matching pattern therefore looks like this:
Any chance of having this looked at, you think?
WordEnd not only looks forward but also looks backward. So WordEnd('A') can
only succeed if the previous character is an 'A'. In both of your cases,
the previous character is 'B', so WordEnd will fail.
-- Paul
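The two-sided check described in this reply can be sketched in plain Python. This is an illustration of the rule as stated, not pyparsing's actual implementation; the function name `word_end_at` is hypothetical, and its handling of position 0 and of the end of the string is an assumption of the sketch.

```python
def word_end_at(text, loc, word_chars):
    """Sketch: a word end at `loc` needs a word character just before
    `loc` and no word character at `loc` itself."""
    if loc == 0:
        return False  # nothing precedes the position (assumption of this sketch)
    prev_is_word = text[loc - 1] in word_chars
    next_is_word = loc < len(text) and text[loc] in word_chars
    return prev_is_word and not next_is_word

text = 'ABC'
print(word_end_at(text, 1, 'A'))   # True: previous is 'A', next is 'B'
print(word_end_at(text, 2, 'A'))   # False: previous is 'B', not a word character
print(word_end_at(text, 2, 'AB'))  # True: 'B' now counts as a word character
```

The second call corresponds to the failing case in the report: once 'B' has been consumed, the character before the position is no longer in the set given to WordEnd('A').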
From: Jonas Olson [mailto:bromskloss@users.sf.net]
Sent: Thursday, October 02, 2014 1:17 PM
To: [pyparsing:bugs]
Subject: [pyparsing:bugs] #75 Won't match WordEnd after optional, and
absent, element
[bugs:#75] http://sourceforge.net/p/pyparsing/bugs/75 Won't match WordEnd
after optional, and absent, element
Status: open
Group: v1.0 (example)
Created: Sun Sep 07, 2014 04:02 PM UTC by Jonas Olson
Last Updated: Sun Sep 07, 2014 04:42 PM UTC
Owner: nobody
Actually, in the examples of my original post, I expect the pattern to match just the 'A' of the input string 'ABC'. More precisely, subpattern a would match 'A' and subpattern b would match ''. The next character would thus be 'B', which would constitute a WordEnd.
Well, since there is a 'B' there, then subpattern b (whether using oneOf or
Optional) will match the 'B'. At that point, the next WordEnd('A') will
fail. Even though the 'B' was optional, pyparsing does not backtrack to undo that
match and check whether the WordEnd would then succeed. I would post a working
example, but I don't really get why you are including both an Optional('B')
and a WordEnd('A'), which must fail if the 'B' is present. If you change
it to WordEnd('AB'), then it makes a little more sense to me.
And I've never really considered using oneOf with a list including an empty
string, in place of Optional. It is not really how oneOf was intended to be
used - does it work?
-- Paul
There we have it, possibly. I was working under the assumption that pyparsing guarantees to find a match if there is at least one interpretation of the matching pattern that matches. That's what I am used to from, for example, regular expressions (where A*A matches the string 'A'), and that's what I thought was the standard way of parsing in general.
This was just a minimal example I put together for the purpose of reporting what I perceived as a bug. Do you want me to post what I'm actually trying to do? It would be great to get a working example of that.
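The contrast drawn above between regular expressions and pyparsing can be illustrated with a short sketch. The regular expression is the one from the post. The pyparsing expression is an assumed rough equivalent built from ZeroOrMore and Literal, chosen for this example only.

```python
import re
from pyparsing import Literal, ParseException, ZeroOrMore

# Regex: the greedy A* first takes the only 'A', then the engine
# backtracks so that the trailing A can still match.
regex_match = re.fullmatch(r'A*A', 'A')
print(regex_match is not None)  # True

# pyparsing: ZeroOrMore takes the only 'A' and does not give it back,
# so the trailing Literal('A') has nothing left to match.
greedy = ZeroOrMore(Literal('A')) + Literal('A')
try:
    greedy.parseString('A')
    pyparsing_matched = True
except ParseException:
    pyparsing_matched = False
print(pyparsing_matched)  # False
```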
At first I thought it worked, but it seems to break when a very specific criterion is satisfied, namely that exactly one of the list elements is exactly two characters long.
Yes, please post a more complete example of what you are doing. I will have a little more time during the holidays to devote to answering pyparsing questions.