[Pyparsing] 'Combine' behaviour when skipping whitespace
From: Celvin <rea...@gm...> - 2009-08-06 01:15:31
Hi,

I recently started porting a parser for a custom file format from the Spirit framework over to pyparsing when I noticed some (at least for me) odd behavior when using Combine in expressions. I know Combine turns off whitespace skipping, probably by altering some internal parser state, but it seems that skipping isn't enabled again, at least not when I would expect it to be.

Consider the following expressions:

    from pyparsing import Combine, Group, Literal, Word, ZeroOrMore, nums, oneOf, restOfLine

    exp = oneOf(["e", "E"]) + ZeroOrMore(oneOf(["+", "-"])) + Word(nums)
    frac = (ZeroOrMore(Word(nums)) + Literal(".") + Word(nums)) | (Word(nums) + Literal("."))
    real_number = Combine((frac + ZeroOrMore(exp)) | (Word(nums) + exp))

Obviously, real_number is what I use to parse standard floating point values from the file. The file contains data measured by a custom hardware device and includes a header that starts with initialization data, one entry per line, looking like this:

    <STRING_ID>____________________ .000 .000 .000 .000

...where <STRING_ID> is a user-defined string used as an identifier, followed by an arbitrary number of underscores and 4 floating point values denoting a spatial position in 3D space and a precision estimate.

For testing purposes, I defined the following expression to parse initialization data:

    init_data = ZeroOrMore(Literal("%") | (Literal("<STRING_ID>") + Combine(ZeroOrMore(Literal("_"))))) + Group(real_number*4) + restOfLine

Now, when I write tests using "init_data.parseString(...)" and pass the aforementioned line as the parameter, I get a ParseException:

    Expected "." (at char 42), (line:1, col:43)

...stating that parsing failed right after the first whitespace following the first floating point value, where another real_number was expected.

If I change real_number to look like this:

    real_number = (frac + ZeroOrMore(exp)) | (Word(nums) + exp)

...thus removing the Combine, parsing is successful. Adding or removing the Combine call inside the init_data expression itself has no effect whatsoever.

If somebody could explain this behavior, I'd be rather grateful.

Regards,
Celvin
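
For reference, the tokens that the header line is expected to yield can be checked independently of pyparsing. The following is a minimal sketch using Python's standard re module rather than pyparsing, so it is a swapped-in technique and not the code from the message. It transcribes exp, frac and real_number as regular expressions; the names EXP, FRAC, REAL, HEADER and parse_header are hypothetical, introduced only for illustration, and the header layout is assumed to be exactly the one shown in the sample line.

```python
import re

# Regex transcription of the grammar from the message (standard library only).
EXP = r"[eE][+-]*\d+"                       # exp: e/E, any number of signs, digits
FRAC = r"(?:\d*\.\d+|\d+\.)"                # frac: ".5", "1.5" or "1."
REAL = rf"(?:{FRAC}(?:{EXP})*|\d+{EXP})"    # real_number, with no embedded whitespace

# Assumed layout: identifier, underscores, four whitespace-separated values, rest of line.
HEADER = re.compile(
    rf"(?P<ident><STRING_ID>)(?P<pad>_*)"
    rf"\s*({REAL})\s+({REAL})\s+({REAL})\s+({REAL})(?P<rest>.*)"
)


def parse_header(line):
    """Return (identifier, [four value strings]), or None if the line does not match."""
    m = HEADER.match(line)
    if m is None:
        return None
    # Groups 1 and 2 are ident and pad; groups 3 to 6 are the four values.
    return m.group("ident"), [m.group(i) for i in range(3, 7)]


if __name__ == "__main__":
    print(parse_header("<STRING_ID>____________________ .000 .000 .000 .000"))
```

Each value comes back as one contiguous string, which is the result Combine is meant to produce for real_number. Note that a regular expression backtracks across alternatives where pyparsing's `|` commits to the first match, so this sketch only shows the intended output and says nothing about why the pyparsing version raises.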