Re: [Pyparsing] Supporting Spelling Variations
From: Paul M. <pa...@al...> - 2007-09-10 15:14:09
"And if you have a priori knowledge of which variation will be more common, put it first in the list." Oh, damn, that came out wrong. There is a caveat with using MatchFirst that some alternatives *must* be placed in a particular order. I learned this when I created the simple IDL parser, and had to specify an expression for the 3 parameter passing keywords, "in", "out", and "inout". I did it like this: paramPassingMechanism = Literal("in") | "out" | "inout" (The '|' operator generates MatchFirst.) Unfortunately, when actually parsing an inout parameter, the MatchFirst will match the leading "in", and "inout" will never match. This is a big part of the reason for the oneOf helper method. oneOf reorders the input strings as necessary to avoid these kinds of masking collisions. So I changed my definition to: paramPassingMechanism = oneOf("in out inout") oneOf splits the input string, and then reorders the options to match the long of two overlapping alternatives. If you print this expression out from the Python prompt, you get: Re:('inout|in|out') showing that pyparsing has created a Regex, and reordered 'inout' to be ahead of 'in'. So to avoid inadvertent masking of overlapping alternatives, I recommend using oneOf over explicit MatchFirst's. -- Paul -----Original Message----- From: pyp...@li... [mailto:pyp...@li...] On Behalf Of Paul McGuire Sent: Monday, September 10, 2007 8:51 AM To: 'Tim Cook'; 'PyParsing List' Subject: Re: [Pyparsing] Supporting Spelling Variations Tim - Avoid Or in favor of MatchFirst when you have unambiguous alternatives like this. And if you have a priori knowledge of which variation will be more common, put it first in the list. Here are a couple of options: - Suppress( oneOf("specialize specialise") ) - MatchFirst( map(Suppress,["specialize","specialise"]) ) - Regex("speciali[sz]e").suppress() I've not had a chance to test any of these for performance, but they should be equivalent functionally. 
-- Paul

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of Tim Cook
Sent: Monday, September 10, 2007 3:29 AM
To: PyParsing List
Subject: [Pyparsing] Supporting Spelling Variations

Hi All,

I need to support both 'specialise' and 'specialize' (actually there are many British/Australian/US spelling variations like this).

My first attempts were to use either an Or list or a MatchFirst list, i.e.

    Suppress("specialise" Or "specialize")

Each of those raised:

    TypeError: unsupported operand type(s) for ^: 'str' and 'str'

My solution (so far) is:

    specialise = Suppress("specialise")
    specialize = Suppress("specialize")
    specialiseSection = (specialise | specialize + ...

Is there a better/more efficient approach?

Cheers,
Tim

--
Timothy Cook, MSc
Health Informatics Research & Development Services
http://timothywayne.cook.googlepages.com/home
01-904-322-8582

_______________________________________________
Pyparsing-users mailing list
Pyp...@li...
https://lists.sourceforge.net/lists/listinfo/pyparsing-users
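
Editorial illustration of the masking problem discussed in this thread. This is only a sketch: it uses Python's standard-library re module as a stand-in, not pyparsing itself, on the assumption that re's left-to-right alternation behaves like MatchFirst for this purpose. Sorting by descending length is a simplification chosen for the example and is not claimed to be oneOf's actual reordering algorithm (oneOf printed 'inout|in|out' above, a different but equally safe order). The names keywords, naive, safe, and spelling are made up for this sketch.

```python
import re

keywords = ["in", "out", "inout"]

# Ordered alternation: the first alternative that matches wins,
# so "in" masks "inout" when it is listed first.
naive = re.compile("|".join(re.escape(k) for k in keywords))

# Assumed fix for this sketch: try longer alternatives first, so a
# keyword can never be masked by one of its own prefixes.
reordered = sorted(keywords, key=len, reverse=True)
safe = re.compile("|".join(re.escape(k) for k in reordered))

naive_match = naive.match("inout").group()
safe_match = safe.match("inout").group()

print(naive_match)  # in     (the leading "in" wins, "inout" is never tried)
print(safe_match)   # inout

# The spelling-variation case from the original question has no prefix
# overlap, so a single character class covers both spellings.
spelling = re.compile(r"speciali[sz]e")
both_spellings_match = all(
    spelling.fullmatch(word) for word in ("specialise", "specialize")
)
print(both_spellings_match)  # True
```

The spelling case needs no reordering because neither 'specialise' nor 'specialize' is a prefix of the other; the ordering caveat only bites for overlapping alternatives such as "in" and "inout".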