Re: [Pyparsing] Word and Regex matching more than they should
From: Stuart L. <st...@vr...> - 2018-01-22 05:28:47
On 22/01/18 10:06, Stuart Longland wrote:
> Okay, something is *definitely* buggy:
>> stuartl@vk4msl-ws ~/vrt/projects/widesky/sdk/hszinc $ ipython2
>> Python 2.7.14 (default, Jan 17 2018, 17:36:45)
>> Type "copyright", "credits" or "license" for more information.
>>
>> IPython 5.4.1 -- An enhanced Interactive Python.
>> ?         -> Introduction and overview of IPython's features.
>> %quickref -> Quick reference.
>> help      -> Python's own help system.
>> object?   -> Details about 'object', use 'object??' for extra details.
>>
>> In [1]: import pyparsing as pp
>>
>> In [2]: class Quantity(object):
>>    ...:     def __init__(self, value, unit):
>>    ...:         self.value = value
>>    ...:         self.unit = unit
>>    ...:     def __repr__(self):
>>    ...:         return 'Q(%r, %r)' % (self.value, self.unit)
>>    ...:
>>
>> In [3]: hs_unit = pp.Regex(ur"[a-zA-Z%_/$\x80-\xffffffff]+")
>>    ...: hs_decimal = pp.Regex(r"-?[\d_]+(\.[\d_]+)?([eE][+\-]?[\d_]+)?").setParseAction(
>>    ...:     lambda toks : [float(toks[0].replace('_',''))])
>>    ...: hs_quantity = (hs_decimal + hs_unit).setParseAction(
>>    ...:     lambda toks : [Quantity(toks[0], unit=toks[1])])
>>    ...:
>>
>> In [4]: hs_quantity.parseString('123.123 abc')
>> Out[4]: ([Q(123.123, 'abc')], {})
>>
>> In [5]: hs_quantity.parseString('123.123 abc', parseAll=True)
>> Out[5]: ([Q(123.123, 'abc')], {})
> *Nowhere*, in those patterns, is a space allowed. Yet, it passes it
> through.

Okay, so the magic was `leaveWhitespace`… without that, it'll silently
discard whitespace around tokens in the parser.

Working around it is a tad ugly, but doable:
https://github.com/vrtsystems/hszinc/commit/4b517d679dc40766340eba87660a7bdf858a68fc

Regards,
-- 
     _ ___             Stuart Longland - Systems Engineer
\  /|_) |                           T: +61 7 3535 9619
 \/ | \ |     38b Douglas Street    F: +61 7 3535 9699
   SYSTEMS    Milton QLD 4064       http://www.vrt.com.au
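
For anyone finding this thread later, here is a minimal sketch of the behaviour described above. It is not the code from the linked commit. It assumes Python 3, so the unit pattern is simplified to ASCII only (the `ur""` literal from the session does not exist there), and the names `lenient`, `strict` and `accepts` are made up for illustration. The idea is that pyparsing skips leading whitespace before each element by default, and calling `leaveWhitespace()` on the combined expression turns that off for it and its sub-expressions.

```python
import pyparsing as pp

# Simplified, ASCII-only version of the unit pattern from the session above
# (an assumption for this sketch, not the original hszinc grammar).
hs_unit = pp.Regex(r"[a-zA-Z%_/$]+")
hs_decimal = pp.Regex(r"-?[\d_]+(\.[\d_]+)?([eE][+\-]?[\d_]+)?").setParseAction(
    lambda toks: [float(toks[0].replace("_", ""))])

# Default behaviour: whitespace is skipped before each element, so a space
# between the number and the unit is accepted even though no pattern allows it.
lenient = hs_decimal + hs_unit

# leaveWhitespace() on the combined expression copies the sub-expressions and
# disables whitespace skipping on them, so the tokens must be adjacent.
strict = (hs_decimal + hs_unit).leaveWhitespace()


def accepts(parser, text):
    """Return True if `parser` matches all of `text` (hypothetical helper)."""
    try:
        parser.parseString(text, parseAll=True)
        return True
    except pp.ParseException:
        return False


print(accepts(lenient, "123.123 abc"))  # True: the space is silently skipped
print(accepts(strict, "123.123 abc"))   # False: the space is no longer skipped
print(accepts(strict, "123.123abc"))    # True: adjacent tokens still match
```

Note that `leaveWhitespace()` applied this way also rejects leading whitespace before the number. If only the gap between the number and the unit should be strict, applying it to a copy of the unit expression alone is another option.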