Re: [Pyparsing] Word and Regex matching more than they should

Stuart -

Yes, leaveWhitespace is what you need: it suppresses pyparsing's default behavior of skipping whitespace before an expression is matched. IIRC, units was to be a trailing set of characters immediately following the number, with no intervening whitespace:

    # -*- coding: latin-1 -*-

    import pyparsing as pp
    import sys
    from itertools import filterfalse

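    # all non-whitespace characters from chr(33) up to (not including) sys.maxunicode
    unicode_printables = ''.join(filterfalse(str.isspace, (chr(i) for i in range(33, sys.maxunicode))))
    unit_chars = unicode_printables
    units = pp.Word(unit_chars)
    # leaveWhitespace() means units only matches if it starts right after the number
    numeric_value = pp.pyparsing_common.number("value") + pp.Optional(units.leaveWhitespace()("units"))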

    numeric_value.runTests("""\
       12345.6
       12345.6mph
       12345.6ft²
       12345.7 mph
    """)

Prints:

    12345.6
    [12345.6]
    - value: 12345.6

    12345.6mph
    [12345.6, 'mph']
    - units: 'mph'
    - value: 12345.6

    12345.6ft²
    [12345.6, 'ft²']
    - units: 'ft²'
    - value: 12345.6

    12345.7 mph
            ^
    FAIL: Expected end of text (at char 8), (line:1, col:9)
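
For comparison, here is a small sketch of the same expression built with and without leaveWhitespace. It is only an illustration: the names loose, tight and parses are made up for this example, and units is narrowed to pp.alphas to keep it short. Note that leaveWhitespace() modifies the expression it is called on, so the sketch works on a copy.

```python
import pyparsing as pp

number = pp.pyparsing_common.number
units = pp.Word(pp.alphas)  # simplified unit characters, for illustration only

# default behavior: whitespace before units is skipped
loose = number("value") + pp.Optional(units("units"))

# leaveWhitespace() changes the expression in place, so work on a copy
tight = number("value") + pp.Optional(units.copy().leaveWhitespace()("units"))


def parses(expr, text):
    """Return the parsed tokens as a list, or None if the whole string does not parse."""
    try:
        return expr.parseString(text, parseAll=True).asList()
    except pp.ParseException:
        return None


for text in ("12345.6", "12345.6mph", "12345.7 mph"):
    print(repr(text), "loose:", parses(loose, text), "tight:", parses(tight, text))
```

With the default skipping, "12345.7 mph" is accepted as a value plus units. With leaveWhitespace the optional units is simply not matched at the space, and the parse then fails on the leftover "mph", as in the last test above.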

Sorry not to have gotten back to you sooner, but it looks like you have worked this out for yourself. I had a look at your first efforts at a pyparsing parser for ZINC when you first sent this out, but when I went to look for it again, it was no longer on GitHub. If you can repost a working link, I may be able to help you tune up your parser a bit.

-- Paul McGuire
