Thread: Re: [Pyparsing] matching a previous match
From: Paul M. <pa...@al...> - 2007-01-27 04:56:52
"""
Waylan -

The problem here is not with parse actions, but with the behavior of
scanString. It is possible to hit some false positives with scanString,
especially with a grammar such as this. scanString works by walking the
input string character by character, trying to match the scan expression
at each location. In your case, scanString predictably finds the first
three foo's, but also finds an unexpected match at the fourth foo. This is
because scanString tries each successive character location. That is,
after matching the third foo, which ends at location 23, scanString does:

loc 24: `````foo`` -> no match
loc 25: ````foo`` -> no match
loc 26: ```foo`` -> no match
loc 27: ``foo`` -> match!

Notice the different behavior when using parseString with a grammar of
OneOrMore(Group(foo)). Now there is no successive matching of each
character location in turn - parseString looks for a foo match at only one
location, 24.

Now it should also be obvious why `bar` was matched successfully.

-- Paul
"""
from pyparsing import *

begin = Word("`")
end = matchPreviousExpr(begin)
foo = begin + Literal('foo') + end

a = foo.scanString("``foo`` ```foo``` `foo` `````foo``")
for i in a:
    print(i)

# this only recognizes the first three foo's ("All the Foos down in Fooville...")
print(OneOrMore(Group(foo)).parseString("``foo`` ```foo``` `foo` `````foo``"))

-----Original Message-----
From: Waylan Limberg [mailto:wa...@gm...]
Sent: Thursday, January 25, 2007 12:35 PM
To: Paul McGuire
Subject: Re: [Pyparsing] matching a previous match

On 1/25/07, Paul McGuire <pa...@al...> wrote:
> Version 1.4.5 includes a couple of new helper methods,
> matchPreviousLiteral and matchPreviousExpr, for exactly this
> situation. I think there is some sample code in the HTML docs, if
> not, write back and I'll post some sample code using these methods.

Cool. Exactly what I had in mind. Unfortunately, the only place I've found
any docs on it is using pydoc, but that should give me enough to get going.
Thanks for the pointer.
>
> There is a current known bug if you are using parse actions with these
> two methods. I've got a fix in the works, just need to push out the
> next release if it's critical for you.

Hmm, I might have run into it. Here is what I have so far:

>>> begin = Word("`")
>>> end = matchPreviousExpr(begin)
>>> foo = begin + Literal('foo') + end
>>> a = foo.scanString("``foo`` ```foo``` `foo` `````foo``")
>>> print([i for i in a])
[((['``', 'foo', '``'], {}), 0, 7), ((['```', 'foo', '```'], {}), 8, 17), ((['`', 'foo', '`'], {}), 18, 23), ((['``', 'foo', '``'], {}), 27, 34)]

That last one shouldn't match. Which would explain why the following
doesn't work:

>>> match = begin + SkipTo(end) + end
>>> b = match.scanString("`foo` ``foo`bar`` `` `bar` ``")
>>> print([i for i in b])
[((['`', 'foo', '`'], {}), 0, 5), ((['`', 'foo', '`'], {}), 7, 12), ((['``', '', '``'], {}), 15, 20), ((['`', 'bar', '`'], {}), 21, 26)]

>
> -- Paul
>
>
> -----Original Message-----
> From: pyp...@li... [mailto:pyp...@li...] On Behalf Of Waylan Limberg
> Sent: Thursday, January 25, 2007 11:00 AM
> To: pyp...@li...
> Subject: [Pyparsing] matching a previous match
>
> I'm trying to match a string enclosed in backticks. Rather than using
> an escape character, the string is simply wrapped in more backticks.
> Here are some examples:
>
> `foo` => foo
> ``foo`bar`` => foo`bar
> `` `bar` `` => `bar` #note the spaces in this one
>
> This is easy in regex:
>
> (?P<backtick>`+)(?P<string>.*?)(?P=backtick)
>
> Of course, the trick is that the ending string of backticks must
> exactly match the opening string of backticks.
>
> Sure I can use Regex() for this, but I'm trying to figure out how to
> do that without regex. How can I refer back to a previous match?
>
> --
> ----
> Waylan Limberg
> wa...@gm...
>
> _______________________________________________
> Pyparsing-users mailing list
> Pyp...@li...
> https://lists.sourceforge.net/lists/listinfo/pyparsing-users
>

--
----
Waylan Limberg
wa...@gm...
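The difference Paul describes between scanning and parsing can be sketched without pyparsing at all. The following is only an illustration, not pyparsing's implementation: the regex `FOO` is an assumed stand-in for `begin + Literal('foo') + end`, and `scan` and `parse_repeated` are hypothetical helper names that mimic, respectively, trying a match at every successive location and trying only at the location right after the previous match.

```python
import re

# Assumed regex stand-in for: Word("`") + Literal('foo') + matchPreviousExpr(begin).
# The backreference \1 requires the closing backticks to repeat the opening run.
FOO = re.compile(r"(`+)foo\1")

TEXT = "``foo`` ```foo``` `foo` `````foo``"


def scan(pattern, text):
    """Hypothetical scanner: try the pattern at every successive location."""
    results = []
    loc = 0
    while loc < len(text):
        m = pattern.match(text, loc)
        if m:
            results.append((m.group(0), m.start(), m.end()))
            loc = m.end()   # resume right after the match
        else:
            loc += 1        # no match here, so try the next character
    return results


def parse_repeated(pattern, text):
    """Hypothetical parser: after each match, try only the next location
    (skipping whitespace) and stop at the first failure."""
    results = []
    loc = 0
    while True:
        while loc < len(text) and text[loc].isspace():
            loc += 1
        m = pattern.match(text, loc)
        if not m:
            break
        results.append((m.group(0), m.start(), m.end()))
        loc = m.end()
    return results


for hit in scan(FOO, TEXT):
    print("scan :", hit)
for hit in parse_repeated(FOO, TEXT):
    print("parse:", hit)
```

With the input from this thread, the scanner sketch reports a fourth hit starting at location 27, because it keeps advancing one character at a time through the run of five backticks, while the parser sketch stops after the third foo because the only location it tries is 24.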
From: Paul M. <pa...@al...> - 2007-01-27 05:05:56
>>> Unfortunately, the only place I've found any docs on it is using pydoc, ...

I just caught this other comment of yours. Do you not have the htmldoc
directory as part of your pyparsing source or doc distribution? I generated
this help directory using epydoc, and I'd hoped it would be formatted well
enough that one such as yourself could find methods such as
matchPreviousXXX.

Unfortunately, the win32 self-installer includes nothing but the basic
pyparsing.py source file, so the sample code and docs get left behind.

I really need to refocus on this documentation issue; it comes up more and
more often.

-- Paul