Re: [Pyparsing] Word and Regex matching more than they should

Hi Paul,

On 22/01/18 19:17, Paul McGuire wrote:
> Stuart -
> 
> Yes, leaveWhitespace is what you need to use to suppress pyparsing's default behavior of skipping whitespace between expressions in your parser. IIRC, units was to be a trailing set of characters, with no intervening whitespace:
> 
>     # -*- coding: latin-1 -*-
> 
>     import pyparsing as pp
>     import sys
>     from itertools import filterfalse
> 
>     unicode_printables = ''.join(filterfalse(str.isspace, (chr(i) for i in range(33, sys.maxunicode + 1))))
>     unit_chars = unicode_printables

Now that's a handy little generator snippet… I've been using various
ugly kludges to try to generate all the code points, but that is nice
and simple.
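
For anyone wanting to reuse the idea on a smaller range, here is a
minimal standard-library sketch of the same technique.  The helper name
`non_space_chars` and the Latin-1 cut-off are my own choices for
illustration, not something from Paul's code.

```python
import sys
from itertools import filterfalse


def non_space_chars(start=33, stop=sys.maxunicode + 1):
    """Lazily yield every character in [start, stop) that is not whitespace.

    The default stop is sys.maxunicode + 1 so the last code point is included.
    """
    return filterfalse(str.isspace, map(chr, range(start, stop)))


# Restricting the range keeps the resulting string small; here, Latin-1 only.
latin1_chars = ''.join(non_space_chars(stop=0x100))
```

Note that this only filters on `str.isspace`, so control characters in
the chosen range are still included.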

>     units = pp.Word(unit_chars)
>     numeric_value = pp.pyparsing_common.number("value") + pp.Optional(units.leaveWhitespace()("units"))
> 
>     numeric_value.runTests("""\
>        12345.6
>        12345.6mph
>        12345.6ft²
>        12345.7 mph
>     """)
> 
> Prints:
> 
>     12345.6
>     [12345.6]
>     - value: 12345.6
> 
> 
>     12345.6mph
>     [12345.6, 'mph']
>     - units: 'mph'
>     - value: 12345.6
> 
> 
>     12345.6ft²
>     [12345.6, 'ft²']
>     - units: 'ft²'
>     - value: 12345.6
> 
> 
>     12345.7 mph
>             ^
>     FAIL: Expected end of text (at char 8), (line:1, col:9)
> 
> Sorry to not have gotten back to you sooner, but it looks like you have worked this out for yourself. I had a look at your first efforts at a pyparsing parser for ZINC when you first sent this out, but when I went to look for it again, it was no longer on Github. If you can repost a working link I may be able to help you tune up your parser a bit.
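
To make the contrast with the default behaviour explicit, here is a
small sketch that builds the same expression with and without
`leaveWhitespace`.  It is only an illustration: the names `lenient`,
`strict` and `parses` are made up, and `units` is cut down to ASCII
letters to keep it short.

```python
import pyparsing as pp

# ASCII letters only, to keep the sketch small; the quoted code allows any
# non-whitespace Unicode character instead.
units = pp.Word(pp.alphas)
number = pp.pyparsing_common.number

# Default behaviour: whitespace in front of `units` is skipped.
lenient = number("value") + pp.Optional(units("units"))

# leaveWhitespace() on a copy: `units` must start exactly where the number ends.
strict = number("value") + pp.Optional(units.copy().leaveWhitespace()("units"))


def parses(expr, text):
    """Return True if `expr` matches the whole of `text`."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except pp.ParseBaseException:
        return False


for text in ("12345.6mph", "12345.7 mph"):
    print(repr(text), "lenient:", parses(lenient, text), "strict:", parses(strict, text))
```

Only the `strict` version should reject "12345.7 mph", which corresponds
to the FAIL case in the quoted output.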

No problem.  I'm on a deadline, but I understand that on this forum
we're all more or less volunteers, so I just kept working at the
problem.  Either someone would reply or I'd figure it out; either way
no harm is done. :-)

The file has been renamed.  Before I switched to `pyparsing`, it only
stored the grammar definitions.  With the `.setParseAction` method,
`pyparsing` now does nearly all of the parsing as well, so it no longer
made sense to call the file "grammar".  It is now "zincparser.py".

https://github.com/vrtsystems/hszinc/blob/feature/WC-1173-add-list-support/hszinc/zincparser.py
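
For readers who haven't used it, this is roughly what I mean by
`.setParseAction` doing the parsing as well as the matching.  It is a
toy sketch, not code from zincparser.py; `Quantity`, `integer`, `unit`
and `quantity` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

import pyparsing as pp


@dataclass
class Quantity:
    value: int
    unit: Optional[str] = None


# The parse action runs whenever `integer` matches and replaces the
# matched string token with a Python int.
integer = pp.Word(pp.nums)
integer.setParseAction(lambda tokens: int(tokens[0]))

unit = pp.Word(pp.alphas)
quantity = integer("value") + pp.Optional(unit("unit"))

# A second action on the whole expression builds the final object, so no
# separate pass over the matched tokens is needed afterwards.
quantity.setParseAction(
    lambda tokens: Quantity(tokens["value"], tokens.get("unit"))
)

parsed = quantity.parseString("42 kg", parseAll=True)[0]
```

Here `parsed` is a `Quantity` instance rather than a list of strings,
which is why the grammar file ends up being the parser too.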

Hopefully things are a little cleaner than my first attempt, but there's
still a lot to learn.  `pyparsing` is quite a powerful little library;
I wish I had stumbled on it sooner.

I've managed to get the tests passing again, so that's a plus.  Test
coverage fell, but that's because a lot of code could be thrown out
thanks to `pyparsing`.

https://travis-ci.org/vrtsystems/hszinc/builds/331703708

Regards,
-- 
     _ ___             Stuart Longland - Systems Engineer
\  /|_) |                           T: +61 7 3535 9619
 \/ | \ |     38b Douglas Street    F: +61 7 3535 9699
   SYSTEMS    Milton QLD 4064       http://www.vrt.com.au