Re: [Pyparsing] Word and Regex matching more than they should
From: Stuart L. <st...@vr...> - 2018-01-22 09:34:44
Hi Paul,

On 22/01/18 19:17, Paul McGuire wrote:
> Stuart -
>
> Yes, leaveWhitespace is what you need to use to suppress pyparsing's
> default behavior of skipping whitespace between expressions in your
> parser. IIRC, units was to be a trailing set of characters, with no
> intervening whitespace:
>
> # -*- coding: latin-1 -*-
>
> import pyparsing as pp
> import sys
> from itertools import filterfalse
>
> unicode_printables = ''.join(filterfalse(str.isspace, (chr(i) for i in range(33, sys.maxunicode))))
> unit_chars = unicode_printables

Now that's a handy little generator snippet… I've been doing various ugly
kludges to try and generate all the code points but that is nice and simple.

> units = pp.Word(unit_chars)
> numeric_value = pp.pyparsing_common.number("value") + pp.Optional(units.leaveWhitespace()("units"))
>
> numeric_value.runTests("""\
>     12345.6
>     12345.6mph
>     12345.6ft²
>     12345.7 mph
>     """)
>
> Prints:
>
> 12345.6
> [12345.6]
> - value: 12345.6
>
>
> 12345.6mph
> [12345.6, 'mph']
> - units: 'mph'
> - value: 12345.6
>
>
> 12345.6ft²
> [12345.6, 'ft²']
> - units: 'ft²'
> - value: 12345.6
>
>
> 12345.7 mph
>         ^
> FAIL: Expected end of text (at char 8), (line:1, col:9)
>
> Sorry to not have gotten back to you sooner, but it looks like you have
> worked this out for yourself. I had a look at your first efforts at a
> pyparsing parser for ZINC when you first sent this out, but when I went
> to look for it again, it was no longer on Github. If you can repost a
> working link I may be able to help you tune up your parser a bit.

No problems… while I'm on a deadline, I can understand that on this forum,
we're all more or less volunteers, hence I just kept working at the problem.
Either someone would reply or I'd figure it out; either way no harm is
done. :-)

Prior to using `pyparsing`, that file just stored the grammar definitions.
`pyparsing`, with the `.setParseAction` method, does nearly all of the
parsing as well, so it no longer made sense to call it "grammar", as it was
more than that. The file got renamed to "zincparser.py".

https://github.com/vrtsystems/hszinc/blob/feature/WC-1173-add-list-support/hszinc/zincparser.py

Hopefully things are a little cleaner than my first attempt, but there's
still lots to be learned. `pyparsing` is quite a powerful little library; I
wish I had stumbled on it sooner.

I've managed to get the tests to pass once again, so that's a plus. Test
coverage fell, but that's because a lot of code could be thrown out thanks
to pyparsing.

https://travis-ci.org/vrtsystems/hszinc/builds/331703708

Regards,
-- 
Stuart Longland - Systems Engineer
38b Douglas Street, Milton QLD 4064
T: +61 7 3535 9619  F: +61 7 3535 9699
http://www.vrt.com.au
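
For anyone wanting to try the character-set snippet quoted above on its own, here is a minimal sketch that needs only the standard library. The helper name `non_whitespace_chars` and its `start`/`stop` parameters are made up for illustration and are not part of pyparsing or hszinc. One thing worth noting: `range(33, sys.maxunicode)` in the quoted version stops one short of the last code point, so this sketch defaults to `sys.maxunicode + 1`; whether that matters for unit strings is doubtful.

```python
import sys
from itertools import filterfalse


def non_whitespace_chars(start=33, stop=sys.maxunicode + 1):
    """Return a string of every code point in [start, stop) that is not whitespace.

    Hypothetical helper: same idea as the quoted snippet, wrapped in a function.
    """
    return ''.join(filterfalse(str.isspace, map(chr, range(start, stop))))


# Limit the demo to the Basic Multilingual Plane to keep it quick;
# call non_whitespace_chars() with no arguments for the full range.
unit_chars = non_whitespace_chars(stop=0x10000)

print('²' in unit_chars)       # True: superscript two, as in "ft²"
print(' ' in unit_chars)       # False: ASCII space lies below the start of the range
print('\u00a0' in unit_chars)  # False: no-break space is rejected by str.isspace
```

The resulting string can then be handed to `pp.Word(...)` exactly as `unit_chars` is in the quoted code.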