Re: [Pyparsing] Using grammar as a condition for loop
From: Paul M. <pt...@au...> - 2013-11-06 12:55:12
You might look at one of the variations on parsing that pyparsing expressions support. The typical case is a parser that handles all of the input text. That takes the most work, because the grammar has to account for everything in the input. You can also write a pyparsing expression that matches only part of the input, and then scan or search for just those parts. I think this may be suitable for your case.

Look over the following code and see how searchString and scanString return the matching lines. scanString returns a Python generator (if you're not familiar with these, look them up) that yields not only the matching tokens but also the start and end locations of each match, so you can pull out the text between matches.

-- Paul

from pyparsing import OneOrMore, Word, alphas

line_of_words = OneOrMore(Word(alphas))

inputText = """\
sldjf lskjflsja lasdfljsdf owiuerowue ndf
122
1203 080182 0123 1023021 013802
02108

aslkjweoiur olsuaperu lsfiwuer kfdsldf
293749237
029 927397 2979 29793732974
9237
o 82739

sjfdhhwl oewr lwkejrlj wlehrnmb
34982 9392
"""

# find all groups of words using searchString
for line in line_of_words.searchString(inputText):
    print(line)

# prints:
# ['sldjf', 'lskjflsja', 'lasdfljsdf', 'owiuerowue', 'ndf']
# ['aslkjweoiur', 'olsuaperu', 'lsfiwuer', 'kfdsldf']
# ['o']
# ['sjfdhhwl', 'oewr', 'lwkejrlj', 'wlehrnmb']

# find all groups and their start/end locations using scanString
for line, start, end in line_of_words.scanString(inputText):
    print(line)

# prints:
# ['sldjf', 'lskjflsja', 'lasdfljsdf', 'owiuerowue', 'ndf']
# ['aslkjweoiur', 'olsuaperu', 'lsfiwuer', 'kfdsldf']
# ['o']
# ['sjfdhhwl', 'oewr', 'lwkejrlj', 'wlehrnmb']

# use scanString to associate intervening text with matched line
parsedData = []
scanner = line_of_words.scanString(inputText)
# prime the loop with the first match; each later match closes the previous group
lastLine, lastStart, lastEnd = next(scanner)
for line, start, end in scanner:
    parsedData.append((lastLine, inputText[lastEnd:start].splitlines()))
    lastLine, lastEnd = line, end

# add final group after last parsed line
parsedData.append((lastLine, inputText[lastEnd:].splitlines()))

for line, data in parsedData:
    print('-', ' '.join(line))
    for d in data:
        print(' ', d)

# prints
#- sldjf lskjflsja lasdfljsdf owiuerowue ndf
#
#  122
#  1203 080182 0123 1023021 013802
#  02108
#
#- aslkjweoiur olsuaperu lsfiwuer kfdsldf
#
#  293749237
#  029 927397 2979 29793732974
#  9237
#- o
#   82739
#
#- sjfdhhwl oewr lwkejrlj wlehrnmb
#
#  34982 9392
#

-----Original Message-----
From: Hanchel Cheng [mailto:han...@br...]
Sent: Tuesday, November 05, 2013 7:15 PM
To: pyp...@li...
Subject: [Pyparsing] Using grammar as a condition for loop

Hello!

I have a text file in a structure like this:

######start#######
[line1 matching grammar]
    #[text]
    #[text]
    [text]
[line2 matching grammar]
    #[text]
    [etc.]
#######end#######

There can be any number of lines, with or without the #, indented under each line that matches the grammar. I'm checking for the grammar, then I would like to check all the lines until the next line that matches the grammar. Something like...

for line in text_file:
    if not(line matches grammar):
        do something

Can pyparsing do this? If not, any suggestions? I can give more info if necessary.

I really appreciate the help!

Kind regards,
Hanchel
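
For comparison, the line-by-line loop sketched in the original question can also be written directly. This is only a minimal sketch under some assumptions: line_of_words stands in for the real grammar, and the helper name group_by_header and the sample lines are made up for illustration. Each line is tested with parseString(line, parseAll=True); lines that raise ParseException are attached to the most recent matching line.

```python
from pyparsing import OneOrMore, ParseException, Word, alphas

# Stand-in grammar (an assumption): a line consisting only of alphabetic words.
line_of_words = OneOrMore(Word(alphas))


def group_by_header(lines, grammar):
    """Return a list of (header_line, [following non-matching lines]) tuples.

    Hypothetical helper, not part of pyparsing.
    """
    groups = []
    for line in lines:
        try:
            # parseAll=True requires the whole line to match, not just a prefix
            grammar.parseString(line, parseAll=True)
        except ParseException:
            # Not a header line: attach it to the most recent header, if any.
            # Lines that appear before the first header are dropped.
            if groups:
                groups[-1][1].append(line)
        else:
            # The line matches the grammar, so it starts a new group.
            groups.append((line, []))
    return groups


# Made-up sample input, loosely following the structure in the question
sample = """\
alpha beta
    # 123
    456
gamma delta
    # 789
""".splitlines()

for header, body in group_by_header(sample, line_of_words):
    print(header, body)
```

Compared with scanString, this works one line at a time, so it also fits reading straight from a file object, but it cannot match a grammar that spans several lines.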