pyparsing-users Mailing List for Python parsing module (Page 4)
Brought to you by:
ptmcg
From: Victor P. <po...@na...> - 2015-07-27 21:46:28
This may be a bug in pyparsing, but more likely it is my misunderstanding. Note that my script uses a modified pyparsing with a new method, .addCondition() (attached).

When I run my script (attached):

    $ ./DuplicateRefs.py chap-filt.lyx

it produces output like:

    ['name "chap-filt"']

In my opinion, it should instead produce

    ['chap-filt']

because I use .suppress() in my code. Sorry for packaging the data in a separate file rather than in a string, but the real example file is long.

What is wrong? How can I make it produce only the label name (like 'chap-filt') instead of ['name "chap-filt"']?

Additional issue: because of a peculiarity of the syntax of the .lyx file (attached) that I analyze, I first split it into tokens and then parse the tokens themselves with another parser. See for example:

    LabelNameLineParser = \
        pyparsing.Keyword("name").suppress() + pyparsing.White(" ").suppress() + \
        pyparsing.Literal('"').suppress() + pyparsing.CharsNotIn('"') + pyparsing.Literal('"').suppress()
    LabelNameLine = Line.copy().addCondition(lambda self, loc, toks: LabelNameLineParser.parseString(toks[0], True))

Maybe we should introduce a shorter API for tasks like this? (I am unsure whether this situation occurs often enough to deserve a special API.) What is your opinion? If I write a patch that does this, will you use it?

--
Victor Porton - http://portonvictor.org
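A possible explanation for the output above, offered as an assumption rather than a confirmed diagnosis: if the modified .addCondition() uses the callable's return value only as a pass/fail flag (which is how the addCondition that later shipped in pyparsing behaves), the tokens produced by the inner parseString call are thrown away and the original line token is kept. A parse action, by contrast, replaces the tokens with whatever it returns. The sketch below illustrates that difference; `line` and `label_name` are hypothetical stand-ins for the script's `Line` and `LabelNameLineParser`, which the message does not show in full.

```python
import pyparsing as pp

# Hypothetical stand-in for LabelNameLineParser: keyword "name", then a quoted string.
# QuotedString strips the surrounding quotes from the returned token.
label_name = pp.Keyword("name").suppress() + pp.QuotedString('"')

# Hypothetical stand-in for Line: one whole line of text as a single token.
line = pp.Regex(r"[^\n]+")

# A parse action's return value replaces the matched tokens, so re-parsing
# the line here yields only the label name.
label_line = line.copy().addParseAction(
    lambda toks: label_name.parseString(toks[0], parseAll=True).asList()
)

result = label_line.parseString('name "chap-filt"')
print(result.asList())
```

If the inner parse fails, the ParseException raised inside the parse action makes the outer expression fail as well, so the same callable can act as both filter and transformer.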
From: Alexander B. <ale...@gm...> - 2015-05-28 11:32:28
Hi all,

I am trying to use pyparsing to parse mathematical expressions with functions and variables and convert these expressions into a different notation. During conversion, variables should also be replaced with the corresponding path using a substitution dictionary.

For example, the input expression

    var1 + var2

should be transformed into

    binarymath(/path/to/file1, /path/to/file2, substract)

Here is a slightly more complex example:

    var1 / sin(var2) + var3

should be transformed into

    binarymath(binarymath(/path/to/file1, sin(/path/to/file2), divide), /path/to/file3, add)

I looked over the examples and decided to use fourfn.py as the base for my parser. To distinguish variables from functions I decided to enclose variables in double quotes, e.g. "var1" + log10("var2"). After some investigation I came up with the following code:

    import math  # needed for math.pi and math.e in evaluateStack
    import re
    from pyparsing import (Literal, CaselessLiteral, Word, Group, Combine, Optional,
                           ZeroOrMore, Forward, nums, alphas, Regex, ParseException)

    exprStack = []
    rasters = dict()

    def pushFirst(strg, loc, toks):
        exprStack.append(toks[0])

    def pushUMinus(strg, loc, toks):
        for t in toks:
            if t == '-':
                exprStack.append('unary -')
            else:
                break

    def rasterPath(rasterName):
        global rasters
        return rasters[rasterName]

    # Define grammar
    bnf = None
    def BNF():
        global bnf
        if not bnf:
            point = Literal('.')
            colon = Literal(',')
            e = CaselessLiteral('E')
            pi = CaselessLiteral("PI")
            # non-capturing groups are written (?:...), not (:?...)
            fnumber = Regex(r'[+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?')
            ident = Combine('"' + Word(alphas, alphas + nums + '_') + '"')
            # allow digits after the first letter so that names like log10 match
            func = Word(alphas, alphas + nums)
            plus = Literal('+')
            minus = Literal('-')
            mult = Literal('*')
            div = Literal('/')
            mod = Literal('%')
            lpar = Literal('(').suppress()
            rpar = Literal(')').suppress()
            addop = plus | minus
            multop = mult | div | mod
            expop = Literal('^')
            expr = Forward()
            atom = ((0, None) * minus +
                    (pi | e | fnumber | ident | func + lpar + expr + rpar | ident).setParseAction(pushFirst) |
                    Group(lpar + expr + rpar)).setParseAction(pushUMinus)
            # by defining exponentiation as "atom [ ^ factor ]..." instead of
            # "atom [ ^ atom ]...", we get right-to-left exponents, instead of
            # left-to-right; that is, 2^3^2 = 2^(3^2), not (2^3)^2.
            factor = Forward()
            factor << atom + ZeroOrMore((expop + factor).setParseAction(pushFirst))
            term = factor + ZeroOrMore((multop + factor).setParseAction(pushFirst))
            expr << term + ZeroOrMore((addop + term).setParseAction(pushFirst))
            bnf = expr
        return bnf

    # map operator symbols to corresponding arithmetic operations
    opn = {'+': lambda x, y: 'binarymathraster({}, {}, add)'.format(x, y),
           '-': lambda x, y: 'binarymathraster({}, {}, substract)'.format(x, y),
           '*': lambda x, y: 'binarymathraster({}, {}, times)'.format(x, y),
           '/': lambda x, y: 'binarymathraster({}, {}, divide)'.format(x, y),
           '%': lambda x, y: 'binarymathraster({}, {}, mod)'.format(x, y),
           '^': lambda x, y: 'binarymathraster({}, {}, power)'.format(x, y)}

    fn = {'sin': lambda x: 'sin({})'.format(x),
          'cos': lambda x: 'cos({})'.format(x),
          'tan': lambda x: 'tan({})'.format(x),
          'asin': lambda x: 'asin({})'.format(x),
          'acos': lambda x: 'acos({})'.format(x),
          'atan': lambda x: 'atan({})'.format(x),
          'log10': lambda x: 'log10({})'.format(x),
          'ln': lambda x: 'ln({})'.format(x),
          'abs': lambda x: 'abs({})'.format(x),
          'sqrt': lambda x: 'sqrt({})'.format(x),
          'ceil': lambda x: 'ceil({})'.format(x),
          'floor': lambda x: 'floor({})'.format(x),
          'sign': lambda x: 'sign({})'.format(x),
          'sinh': lambda x: 'sinh({})'.format(x)}

    def evaluateStack(s):
        op = s.pop()
        if op == 'unary -':
            return -evaluateStack(s)
        if op in opn:  # was: op in '+-*/^', which skipped '%'
            op2 = evaluateStack(s)
            op1 = evaluateStack(s)
            return opn[op](op1, op2)
        elif op == 'PI':
            return math.pi  # 3.1415926535
        elif op == 'E':
            return math.e  # 2.718281828
        elif op in fn:
            return fn[op](evaluateStack(s))
        elif re.search('\"(.+?)\"', op):
            return rasterPath(op.strip('"'))
        elif op[0].isalpha():
            raise Exception('invalid identifier "%s"' % op)
        else:
            return float(op)

The same code is on pastebin: http://pastebin.com/McQzfg2d.

Is this the correct approach? As I'm new to pyparsing I would be very thankful for feedback and suggestions on how to improve this code and make it reliable and easy to extend with new functions/operators.

Thanks, and sorry for my English

--
Alexander Bruy
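As one possible alternative to the stack-based fourfn.py approach described above, the translation can be sketched with pyparsing's infixNotation helper, rendering the output string directly in parse actions. This is a minimal sketch under stated assumptions, not a drop-in replacement: the `PATHS` table, the `OP_NAMES` keywords and the `translate` helper are hypothetical names, exponentiation and unary minus are left out, and a function call is recognised by the "(" that follows its name, so variables do not need quoting here.

```python
import pyparsing as pp

# Hypothetical substitution table and operation keywords; replace with real ones.
PATHS = {"var1": "/path/to/file1", "var2": "/path/to/file2", "var3": "/path/to/file3"}
OP_NAMES = {"+": "add", "-": "subtract", "*": "times", "/": "divide"}

ident = pp.Word(pp.alphas, pp.alphanums + "_")
number = pp.Regex(r"\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")
expr = pp.Forward()


def render_variable(toks):
    # Replace a variable name with its path from the substitution table.
    return PATHS[toks[0]]


def render_call(toks):
    # toks[0] is the group [function_name, already_rendered_argument].
    name, arg = toks[0]
    return "{}({})".format(name, arg)


def render_binary(toks):
    # toks[0] is [operand, op, operand, op, operand, ...]; fold left to right.
    items = toks[0]
    result = items[0]
    for i in range(1, len(items), 2):
        result = "binarymath({}, {}, {})".format(result, items[i + 1], OP_NAMES[items[i]])
    return result


variable = ident.copy().setParseAction(render_variable)
func_call = pp.Group(ident + pp.Suppress("(") + expr + pp.Suppress(")")).setParseAction(render_call)
operand = func_call | number | variable

# Precedence levels are listed from tightest-binding to loosest-binding.
expr <<= pp.infixNotation(operand, [
    (pp.oneOf("* /"), 2, pp.opAssoc.LEFT, render_binary),
    (pp.oneOf("+ -"), 2, pp.opAssoc.LEFT, render_binary),
])


def translate(text):
    return expr.parseString(text, parseAll=True)[0]


print(translate("var1 / sin(var2) + var3"))
```

Adding an operator then means adding one entry to `OP_NAMES` and one symbol to the matching precedence level, which may be easier to extend than a hand-maintained evaluation stack.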
From: Paul M. <pt...@au...> - 2015-04-22 21:07:33
Please check what version of pyparsing you are running. I think your Mac has an older version installed. In Python:

    >>> import pyparsing
    >>> pyparsing.__version__

This should print 2.0.3. This latest version expands arrays of objects; older versions did not.

To get the latest version of pyparsing, use easy_install or pip.

-- Paul

-----Original Message-----
From: mlist @dslextreme.com [mailto:ml...@ds...]
Sent: Wednesday, April 22, 2015 12:34 PM
To: pyp...@li...
Subject: [Pyparsing] Difference between Mac and Windows?

> I am seeing something strange. At first, I thought I was doing something
> wrong, but now I wonder if it is a bug. I have my script (Dropbox
> <https://www.dropbox.com/s/s4chl173yi4xyj8/foo.py?dl=0>) and when I run it
> on a Windows machine *.dump()* expands everything nicely. When I run the
> same script on a Mac, it spews out a list *[[bunch o' data]]*. Has anyone
> else seen this?
From: mlist @dslextreme.c. <ml...@ds...> - 2015-04-22 19:18:57
There is definitely a difference between the output of .dump() on Windows and Mac. I tried on Win7, Win 10 and Mac OS X 10.10. Here is an example of what I am seeing:

*Windows Output*

C:\Python27\python.exe C:/Users/qtqa/Desktop/scripts/run_moovscope.py
['Track ID', 2, 'vide', '(', 'Video', ') Enabled Not self-contained', 'Format', 'vide/avc1', 'dimensions: video', '1920x1080', 'presentation:', '1920x1080(pixelAspect+clean)', 'cleanAperture:', '1920x1080 @ 0,0 (originTopLeft)', 'MediaTimescale:600', 'Duration:3003/60000:00:05.005', 'MinSampleDuration:20/600', 'AdvanceDecodeDelta:21/600 00:00:00.035', 'Num data bytes:', 6555892, 'Est. data rate:', '10.479Mbps', 'Nominal framerate:', '29.970fps', 150, ' samples', 'Frame', ' Reordering Required', 'Included in auto selection. Language code <und>', 'Dimensions:', '1920x1080', 'CleanAperture: ', '1920x1080', 'ProductionAperture:', '1920x1080', 'EncodedPixels:', '1920x1080', 'Track Matrix:', ' 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0', 1, 'edit:', ['Media start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], 'Track start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], '']]
- CleanAperture: 1920x1080
- Dimensions: 1920x1080
- EncodedPixels: 1920x1080
- ProductionAperture: 1920x1080
- TrackMatrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0
- cleanAperture: 1920x1080 @ 0,0 (originTopLeft)
- data_bytes: 6555892
- decode_delta: 21/600 00:00:00.035
- edits:
  [0]:
    ['Media start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], 'Track start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], '']
    - MediaDuration: ['3000/600', '00:00:05.000']
    - MediaStart: ['0/600', '00:00:00.000']
    - TrackDuration: ['3000/600', '00:00:05.000']
    - TrackStart: ['0/600', '00:00:00.000']
- estimated_data_rate: 10.479Mbps
- fps: 29.970fps
- number_of_edits: 1
- presentation: 1920x1080(pixelAspect+clean)
- sample_duration: 20/600
- samples: 150
- track_format: vide/avc1
- type: Video
- video: 1920x1080

Process finished with exit code 0

*Mac Output*

/System/Library/Frameworks/Python.framework/Versions/2.7/bin/python2.7 /Users/test/Desktop/PlaybackBaseline/run_moovscope.py
['Track ID', 2, 'vide', '(', 'Video', ') Enabled Not self-contained', 'Format', 'vide/avc1', 'dimensions: video', '1920x1080', 'presentation:', '1920x1080(pixelAspect+clean)', 'cleanAperture:', '1920x1080 @ 0,0 (originTopLeft)', 'MediaTimescale:600', 'Duration:3003/60000:00:05.005', 'MinSampleDuration:20/600', 'AdvanceDecodeDelta:21/600 00:00:00.035', 'Num data bytes:', 6555892, 'Est. data rate:', '10.479Mbps', 'Nominal framerate:', '29.970fps', 150, ' samples', 'Frame', ' Reordering Required', 'Included in auto selection. Language code <und>', 'Dimensions:', '1920x1080', 'CleanAperture: ', '1920x1080', 'ProductionAperture:', '1920x1080', 'EncodedPixels:', '1920x1080', 'Track Matrix:', ' 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0', 1, 'edit:', ['Media start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], 'Track start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], '']]
- CleanAperture: 1920x1080
- Dimensions: 1920x1080
- EncodedPixels: 1920x1080
- ProductionAperture: 1920x1080
- TrackMatrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0
- cleanAperture: 1920x1080 @ 0,0 (originTopLeft)
- data_bytes: 6555892
- decode_delta: 21/600 00:00:00.035
- edits: [['Media start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], 'Track start', ['0/600', '00:00:00.000'], 'dur', ['3000/600', '00:00:05.000'], '']]
- estimated_data_rate: 10.479Mbps
- fps: 29.970fps
- number_of_edits: 1
- presentation: 1920x1080(pixelAspect+clean)
- sample_duration: 20/600
- samples: 150
- track_format: vide/avc1
- type: Video
- video: 1920x1080

Process finished with exit code 0
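To check whether a difference like the one above comes from the installed pyparsing version rather than from the operating system, a small self-contained script can be run on both machines. The grammar below is a hypothetical toy, not the poster's; it only has the same shape as the `edits` entry (a named list of groups that themselves carry named fields). The layout printed by dump() may differ between pyparsing releases, so no expected output is shown.

```python
import pyparsing as pp

# Toy grammar: a named list of groups, each group with its own named fields.
pair = pp.Group(pp.Word(pp.nums)("num") + pp.Word(pp.alphas)("word"))
grammar = pp.OneOrMore(pair)("pairs")

result = grammar.parseString("1 a 2 b")

# Print the version first, so the two machines can be compared directly.
print(pp.__version__)
print(result.dump())
```

If the two machines print different version strings, the differing dump() layout is most likely explained by that alone.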
From: mlist @dslextreme.c. <ml...@ds...> - 2015-04-22 17:33:42
I am seeing something strange. At first, I thought I was doing something wrong, but now I wonder if it is a bug. I have my script (Dropbox <https://www.dropbox.com/s/s4chl173yi4xyj8/foo.py?dl=0>) and when I run it on a Windows machine *.dump()* expands everything nicely. When I run the same script on a Mac, it spews out a list *[[bunch o' data]]*. Has anyone else seen this?
From: Robin S. <rob...@ds...> - 2015-04-22 04:22:20
I have a grammar that works for almost all of my file now. However, there is still one block I am having trouble with. There are 3 blocks that are similar: audio track, video track and subtitle track. For some reason, I can't get the correct grammar for the subtitle block. What am I doing wrong?

Sample Text

    Track ID 1 soun (Audio) Enabled Not self-contained Format soun/aac 48000 Hz aac FormatFlags: 0x00000000 Bytes/Pkt: 0 Frames/Pkt: 1024 Bytes/Frame: 0 Chan/Frame: 2 Bits/Chan: 0 Reserved: 0x00000000 ChannelLayout: Stereo (L R) Media Timescale: 48000 Duration: 240640/48000 00:00:05.013 MinSampleDuration: 1024/48000 AdvanceDecodeDelta: 0/48000 00:00:00.000 Num data bytes: 80213 Est. data rate: 127.999 kbps Nominal framerate: 46.875 fps 235 samples Track volume: 1 Included in auto selection. Language code <und> Dimensions: 0 x 0 Track Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0 1 edit: Media start 0/48000 00:00:00.000 dur 3000/600 00:00:05.000 Track start 0/600 00:00:00.000 dur 3000/600 00:00:05.000

    Track ID 2 vide (Video) Enabled Not self-contained Format vide/avc1 dimensions: video 1920 x 1080, presentation: 1920 x 1080 (pixelAspect+clean), cleanAperture: 1920 x 1080 @ 0,0 (originTopLeft) Media Timescale: 600 Duration: 3003/600 00:00:05.005 MinSampleDuration: 20/600 AdvanceDecodeDelta: 21/600 00:00:00.035 Num data bytes: 6555892 Est. data rate: 10.479 Mbps Nominal framerate: 29.970 fps 150 samples Frame Reordering Required Included in auto selection. Language code <und> Dimensions: 1920 x 1080 CleanAperture: 1920 x 1080 ProductionAperture: 1920 x 1080 EncodedPixels: 1920 x 1080 Track Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0 1 edit: Media start 0/600 00:00:00.000 dur 3000/600 00:00:05.000 Track start 0/600 00:00:00.000 dur 3000/600 00:00:05.000

    Track ID 8 sbtl (Subtitles) Enabled Not self-contained Format sbtl/tx3g DisplayFlags: 0x00000000 Just: 0 H 0 V Default Text Box: 0 x 0 @ 0, 0 Default Style: Local fontID 1 Size -1 Color (RGBA): 1 1 1 1 Font Name: Arial Media Timescale: 600 Duration: 4800/600 00:00:08.000 MinSampleDuration: 1200/600 AdvanceDecodeDelta: 0/600 00:00:00.000 Num data bytes: 226 Est. data rate: 0.226 kbps Nominal framerate: 0.375 fps 3 samples Included in auto selection. Language code <und> Dimensions: 0 x 0 Track Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0 2 edits: Media start INVALID TIME dur 6000/600 00:00:10.000 Track start 0/600 00:00:00.000 dur 6000/600 00:00:10.000 (EMPTY EDIT) Media start 1002/600 00:00:01.670 dur 3000/600 00:00:05.000 Track start 6000/600 00:00:10.000 dur 3000/600 00:00:05.000

The full code is in my Dropbox: https://www.dropbox.com/s/s4chl173yi4xyj8/foo.py?dl=0

Here is the relevant code:

    # Define Track Info Block
    # Audio Block
    self.audio_track_info = Group(self.crap + self.track_id + self.audio_track_format +
                                  self.channel_layout + self.media_timescale + self.track_data +
                                  self.audio_track_volume + self.included +
                                  self.audio_track_dimensions + OneOrMore(self.edits))
    # Subtitle Block
    self.subtitle_track_info = Group(self.crap + self.track_id + self.subtitle_track_format) + \
        self.media_timescale + self.track_data + self.included + \
        self.subtitle_track_dimensions + OneOrMore(self.edits)
    # Video Block
    self.video_track_info = Group(self.crap + self.track_id + self.video_track_format +
                                  self.media_timescale + self.track_data + self.frame +
                                  self.included + self.video_track_dimensions +
                                  OneOrMore(self.edits))
    self.tracks = OneOrMore(self.audio_track_info | self.video_track_info |
                            self.subtitle_track_info).setResultsName('tracks')

-- Sent with Postbox <http://www.getpostbox.com>
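One visible difference between the three blocks in the code above is where the Group(...) call is closed: the audio and video blocks wrap the whole sequence, while the subtitle block closes the Group right after the track format. Whether that is the actual cause of the problem is not confirmed anywhere in the thread (the "INVALID TIME" and "(EMPTY EDIT)" entries in the subtitle sample may matter as well). The toy sketch below, using hypothetical stand-in expressions, only shows what the parenthesis placement does to the shape of the results.

```python
import pyparsing as pp

# Hypothetical stand-ins for the track header fields and the edit list.
header = pp.Word(pp.alphas)("kind")
size = pp.Word(pp.nums)("size")
edits = pp.OneOrMore(pp.Word(pp.alphas))

# Group closed early, as in the subtitle block: the edits land outside the group.
closed_early = pp.Group(header + size) + edits

# Group closed at the end, as in the audio and video blocks.
closed_late = pp.Group(header + size + edits)

text = "sbtl 226 a b"
early = closed_early.parseString(text).asList()
late = closed_late.parseString(text).asList()
print(early)  # [['sbtl', '226'], 'a', 'b']
print(late)   # [['sbtl', '226', 'a', 'b']]
```

With the early close, everything after the header ends up at the top level of the results instead of inside the track's own group, which matters once several tracks are collected with OneOrMore.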
From: Robin S. <ml...@ds...> - 2015-04-22 04:12:58
I was able to revert to a previous save and everything displays correctly now. :)

> mlist @dslextreme.com <mailto:ml...@ds...>
> Tuesday, April 21, 2015 3:30 PM
> For some reason pastebin says my paste has been removed, so I am
> sharing it on Dropbox
> <https://www.dropbox.com/s/s4chl173yi4xyj8/foo.py?dl=0>
>
> mlist @dslextreme.com <mailto:ml...@ds...>
> Tuesday, April 21, 2015 11:25 AM
> Thanks for all the help with my previous problem. Now I almost have
> it parsing everything I want, but I have a display problem. When I
> print result.track.dump(), instead of everything printing out nicely
> in a hierarchy, everything prints out as a list, i.e. [[ bunch o'
> data]]. I *had* it displaying correctly before, so I am sure that it
> has something to do with grouping or ResultNames. However, I haven't
> been able to fix the problem via trial and error.
>
> Since the script is rather long now, I pasted it to my pastebin -
> Pastebin - pyparsing script <http://pastebin.com/u8XVEtxt>

-- Sent with Postbox <http://www.getpostbox.com>
From: mlist @dslextreme.c. <ml...@ds...> - 2015-04-21 22:30:21
For some reason pastebin says my paste has been removed, so I am sharing it on Dropbox <https://www.dropbox.com/s/s4chl173yi4xyj8/foo.py?dl=0>

On Tue, Apr 21, 2015 at 11:25 AM, mlist @dslextreme.com <ml...@ds...> wrote:

> Thanks for all the help with my previous problem. Now I almost have it
> parsing everything I want, but I have a display problem. When I print
> result.track.dump(), instead of everything printing out nicely in a
> hierarchy, everything prints out as a list, i.e. [[ bunch o' data]]. I
> *had* it displaying correctly before, so I am sure that it has something
> to do with grouping or ResultNames. However, I haven't been able to fix
> the problem via trial and error.
>
> Since the script is rather long now, I pasted it to my pastebin - Pastebin
> - pyparsing script <http://pastebin.com/u8XVEtxt>
From: mlist @dslextreme.c. <ml...@ds...> - 2015-04-21 18:25:27
Thanks for all the help with my previous problem. Now I almost have it parsing everything I want, but I have a display problem. When I print result.track.dump(), instead of everything printing out nicely in a hierarchy, everything prints out as a list, i.e. [[ bunch o' data]]. I *had* it displaying correctly before, so I am sure that it has something to do with grouping or ResultNames. However, I haven't been able to fix the problem via trial and error.

Since the script is rather long now, I pasted it to my pastebin - Pastebin - pyparsing script <http://pastebin.com/u8XVEtxt>
From: Paul M. <pt...@au...> - 2015-04-21 07:14:35
I agree with John's suggestion: define an overall grammar that is simply OneOrMore(video_block | audio_block), and then use that to parse the mixed listing of blocks.

The other way in pyparsing to treat a grammar like "A, B, C and D, in any order" is to use the Each construct, created using the '&' operator. So "A & B & C & D" will match the 4 elements in any order, but all 4 must be present. If some are optional, then indicate them so using Optional, as in "A & B & Optional(C) & D". Finally, if some items might appear more than once, use ZeroOrMore or OneOrMore, as in "OneOrMore(A) & B & ZeroOrMore(C) & D". This last expression will match the repeated elements even if they are not all together, so AABADBACC would match, as would DABAAA. ABC would *not* match, as both B and D elements are required.

But for your particular case, I think just OneOrMore(A | B) should be sufficient for any combination of A's and B's.

I'm glad to see some of the other pyparsing folks stepping up to answer some questions, here and on StackOverflow. And thanks, John, for your kind comments on the help you get on the wikispaces site. As it turns out, between work and family activities for the next month or so, my participation on these lists will be limited, so I appreciate other experienced pyparsing users helping the new folks.

Cheers,
-- Paul
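The two constructs discussed in this reply, alternation inside OneOrMore and the Each construct built with '&', can be tried out on toy input. The sketch below uses made-up keyword expressions in place of the real video and audio block grammars, which are not reproduced here.

```python
import pyparsing as pp

# Stand-ins for the real block grammars: each "block" is just one keyword.
video = pp.Group(pp.Keyword("video"))
audio = pp.Group(pp.Keyword("audio"))

# Any mix of blocks, in any order and any number.
tracks = pp.OneOrMore(video | audio)
mixed = tracks.parseString("audio video audio", parseAll=True).asList()
print(mixed)  # [['audio'], ['video'], ['audio']]

# Each ('&'): every required element must appear once, in any order.
A, B, C, D = (pp.Keyword(name) for name in "ABCD")
each_expr = A & B & pp.Optional(C) & D
any_order = each_expr.parseString("D B A", parseAll=True).asList()
print(sorted(any_order))  # ['A', 'B', 'D']

# Leaving out a required element makes the whole Each fail.
try:
    each_expr.parseString("A B C", parseAll=True)
    missing_d_fails = False
except pp.ParseException:
    missing_d_fails = True
print(missing_d_fails)  # True
```

For a listing where blocks can repeat freely, the OneOrMore form is usually the simpler fit; Each is more useful when every kind of element is expected a fixed number of times but in no fixed order.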
From: Hans M. <han...@gm...> - 2015-04-21 06:58:58
Hi,

On 21.04.2015 at 07:41, john grant <joh...@ya...> wrote:

> I've been stuck on the same thing. I think I know the answer, … that
> example has parsers defined for each type of member (e.g. single variable,
> array, pointer, etc), and then there is a single parser that ORs each of
> the other parsers together, and the matches get inserted into a container
> (i.e. OneOrMore).

That is exactly how I would express it:

    OneOrMore(Video | Audio)

Or is there anything I am missing?

Best regards,
Hans
From: john g. <joh...@ya...> - 2015-04-21 05:41:22
I've been stuck on the same thing. I think I know the answer, but I have not had time to verify it. If you find one of the examples for parsing a C structure, I think it must contain the secret because the parser works no matter the order of the structure members. If my memory is correct, that example has parsers defined for each type of member (e.g. single variable, array, pointer, etc), and then there is a single parser that ORs each of the other parsers together, and the matches get inserted into a container (i.e. OneOrMore). Let me know if you find a fix!

FYI: I've had great support on the wikispaces site. Paul has answered many of my questions. However, finding that support channel was insanely difficult. You have to go to the home page, then click Getting Help in the left side menu, then in the body of the page that appears, click the text that is underlined/hyperlinked saying "pyparsing home page" (which is poorly named).

-John
From: mlist @dslextreme.c. <ml...@ds...> - 2015-04-20 19:37:22
I have everything I need to parse a file defined, but I am running into a problem. The main part of the file consists of different blocks (see below) and the blocks can be in a different order or not even there. I am not sure how to deal with this. For example, it could be

    [video block] [audio block]

or

    [audio block] [video block]

or

    [video block] [audio block] [audio block]

or any permutation of the above.

Here is the sample text:

    Track ID 1 soun (Audio) Enabled Not self-contained Format soun/aac 48000 Hz aac FormatFlags: 0x00000000 Bytes/Pkt: 0 Frames/Pkt: 1024 Bytes/Frame: 0 Chan/Frame: 2 Bits/Chan: 0 Reserved: 0x00000000 ChannelLayout: Stereo (L R) Media Timescale: 48000 Duration: 240640/48000 00:00:05.013 MinSampleDuration: 1024/48000 AdvanceDecodeDelta: 0/48000 00:00:00.000 Num data bytes: 80213 Est. data rate: 127.999 kbps Nominal framerate: 46.875 fps 235 samples Track volume: 1 Included in auto selection. Language code <und> Dimensions: 0 x 0 Track Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0 1 edit: Media start 0/48000 00:00:00.000 dur 3000/600 00:00:05.000 Track start 0/600 00:00:00.000 dur 3000/600 00:00:05.000

    Track ID 2 vide (Video) Enabled Not self-contained Format vide/avc1 dimensions: video 1920 x 1080, presentation: 1920 x 1080 (pixelAspect+clean), cleanAperture: 1920 x 1080 @ 0,0 (originTopLeft) Media Timescale: 600 Duration: 3003/600 00:00:05.005 MinSampleDuration: 20/600 AdvanceDecodeDelta: 21/600 00:00:00.035 Num data bytes: 6555892 Est. data rate: 10.479 Mbps Nominal framerate: 29.970 fps 150 samples Frame Reordering Required Included in auto selection. Language code <und> Dimensions: 1920 x 1080 CleanAperture: 1920 x 1080 ProductionAperture: 1920 x 1080 EncodedPixels: 1920 x 1080 Track Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0 1 edit: Media start 0/600 00:00:00.000 dur 3000/600 00:00:05.000 Track start 0/600 00:00:00.000 dur 3000/600 00:00:05.000

And here is what my code looks like:

    # Audio Block
    self.audio_track_info = Group(
        self.track_id + self.audio_track_format + self.channel_layout +
        self.media_timescale + self.track_data + self.audio_track_volume +
        self.included + self.audio_track_dimensions +
        OneOrMore(self.edits).setResultsName('edits')).setResultsName('audio_track')
    # Video Block
    self.video_track_info = Group(
        self.track_id + self.video_track_format + self.media_timescale +
        self.track_data + self.included + self.video_track_dimensions +
        OneOrMore(self.edits).setResultsName('edits')).setResultsName('video_track')
    self.tracks = Group(ZeroOrMore(self.video_track_info)) + Group(ZeroOrMore(self.audio_track_info)).setResultsName('tracks')

I am using parseString on the text and this only returns the audio block and not the video block. How do I fix this?
From: Robin S. <rob...@ds...> - 2015-04-20 00:29:48
I am new to this and I am having trouble figuring out how to parse the text below. Much of the data in the 2 blocks below is similar. There is one crucial keyword that will differentiate the 2 blocks: the 1st line will contain either "vide (Video)" or "soun (Audio)". I can't figure out how to determine which one is present in the first line and then how to say 'If it is "vide" do this, otherwise do that'.

    Track ID 1 vide (Video) Enabled Self-contained Format vide/avc1 dimensions: video 1920 x 1080, presentation: 1920 x 1080 (pixelAspect+clean), cleanAperture: 1920 x 1080 @ 0,0 (originTopLeft) Media Timescale: 600 Duration: 14076/600 00:00:23.460 MinSampleDuration: 10/600 AdvanceDecodeDelta: 0/600 00:00:00.000 Num data bytes: 91412686 Est. data rate: 31.172 Mbps Nominal framerate: 59.974 fps 1407 samples Included in auto selection. Language code<und> Dimensions: 1920 x 1080 CleanAperture: 1920 x 1080 ProductionAperture: 1920 x 1080 EncodedPixels: 1920 x 1080 Track Matrix: 0.0 1.0 0.0 / -1.0 0.0 0.0 / 1080.0 0.0 1.0 1 edit: Media start 0/600 00:00:00.000 dur 14076/600 00:00:23.460 Track start 0/600 00:00:00.000 dur 14076/600 00:00:23.460

    Track ID 2 soun (Audio) Enabled Self-contained Format soun/aac 44100 Hz aac FormatFlags: 0x00000000 Bytes/Pkt: 0 Frames/Pkt: 1024 Bytes/Frame: 0 Chan/Frame: 1 Bits/Chan: 0 Reserved: 0x00000000 ChannelLayout: Mono Media Timescale: 44100 Duration: 1037312/44100 00:00:23.522 MinSampleDuration: 1024/44100 AdvanceDecodeDelta: 0/44100 00:00:00.000 Num data bytes: 184508 Est. data rate: 62.753 kbps Nominal framerate: 43.066 fps 1013 samples Track volume: 1 Included in auto selection. Language code<und> Dimensions: 0 x 0 Track Matrix: 1.0 0.0 0.0 / 0.0 1.0 0.0 / 0.0 0.0 1.0 1 edit: Media start 0/44100 00:00:00.000 dur 14076/600 00:00:23.460 Track start 0/600 00:00:00.000 dur 14076/600 00:00:23.460

-- Sent with Postbox <http://www.getpostbox.com>
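In pyparsing, the "if it is vide do this, otherwise do that" decision asked about above is usually expressed as an alternation between two block grammars, each of which starts with its own distinguishing keyword; the parser tries one alternative and falls back to the other. The sketch below is heavily simplified and covers only the first line of each track; the names `track_id`, `video_block`, `audio_block` and the results name `kind` are illustrative assumptions.

```python
import pyparsing as pp

integer = pp.Word(pp.nums).setParseAction(lambda toks: int(toks[0]))
track_id = pp.Suppress("Track ID") + integer("track_id")

# Simplified blocks: only the header line of each track type is modelled.
video_block = pp.Group(track_id + pp.Keyword("vide")("kind") + pp.Suppress("(Video)"))
audio_block = pp.Group(track_id + pp.Keyword("soun")("kind") + pp.Suppress("(Audio)"))

# The alternation does the branching: whichever block matches is used.
track = video_block | audio_block

text = "Track ID 1 vide (Video) Track ID 2 soun (Audio)"
tracks = pp.OneOrMore(track).parseString(text, parseAll=True)

kinds = [(t.track_id, t.kind) for t in tracks]
for number, kind in kinds:
    if kind == "vide":
        print("track", number, "is video")
    else:
        print("track", number, "is audio")
```

In a full grammar, each block would continue with the fields specific to its type after the header, so the type-specific handling lives in the grammar rather than in an explicit if statement.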
From: Moti Ben-A. <mot...@gm...> - 2015-04-02 08:21:33
|
Hello, I am working with a system that has its own simple programming language, but I would like to construct a preprocessor that will translate a small subset of Python into this language. Can I use pyparsing to do this? I found pythonGrammarParsing.py which seems that it might help, but I couldn't find an explanation of how to use the bnfDefs to actually parse a Python program. Thanks Moti |
From: Dan L. <dl...@gm...> - 2014-10-27 19:09:06
|
Daniel Lenski <dlenski <at> gmail.com> writes:
> So I think I'm going to have to do incremental parsing in order to get
> reasonably fast feedback from the parser. Do you have any suggestions for
> how to do this? I'm trying to figure out if there's a good way to do greedy
> consumption of trailing (whitespace|comments) at the end of each valid
> top-level element.

I modified parseString very slightly and came up with parseConsumeString(). This version calls self._parse() followed by self.preParse() repeatedly to do what I want when self is the parser for a "top-level" item.

    def parseConsumeString(self, instring, parseAll=False, yieldLoc=True, loopResetCache=False):
        if not loopResetCache:
            ParserElement.resetCache()
        if not self.streamlined:
            self.streamline()
            #~ self.saveAsList = True
        for e in self.ignoreExprs:
            e.streamline()
        if not self.keepTabs:
            instring = instring.expandtabs()
        try:
            loc = 0
            while loc<len(instring):
                sloc = loc
                if loopResetCache:
                    ParserElement.resetCache()
                loc, tokens = self._parse(instring, loc)
                if yieldLoc:
                    yield tokens, sloc, loc
                else:
                    yield tokens
                loc = self.preParse(instring, loc)
        except ParseBaseException as exc:
            if not parseAll:
                return
            if ParserElement.verbose_stacktrace:
                raise
            else:
                # catch and re-raise exception from here, clears out pyparsing internal stack trace
                raise exc

By the way, I moved ParserElement.resetCache() into the loop, in order to drastically reduce memory consumption with packrat caching. Memory consumption goes down from around 6G peak to around 100M peak, while running about 15-20% faster. This is on a Core i7 980X with 8G of RAM, Win7.

    In [1]: import my_parser_module as P
    In [2]: sample=open("large_file").read()
    In [3]: len(sample)
    Out[3]: 9153816
    In [4]: %timeit -n1 for r in P.parseConsumeString(P.TopLevel.ignore(P.COMMENT), sample, True, True, True): pass
    1 loops, best of 3: 1min 10s per loop
    In [6]: %timeit -n1 for r in P.parseConsumeString(P.TopLevel.ignore(P.COMMENT), sample, True, True, False): pass
    1 loops, best of 3: 1min 22s per loop

Thanks,
Dan
|
From: <pt...@au...> - 2014-10-27 16:33:24
|
Before you go too far down this path, try enabling packrat parsing, which should help both performance and memory footprint.

Right after importing pyparsing, add this line:

    ParserElement.enablePackrat()

-- Paul

---- Dan Lenski <dl...@gm...> wrote:
> I'm using PyParsing to parse some rather large text files with a C-like
> format (braces and semicolons and all that). PyParsing works just great,
> but it is slow and consumes a very large amount of memory due to the
> size of my files.
>
|
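For readers unsure where that line goes, here is a minimal sketch, assuming a pyparsing version that provides `ParserElement.enablePackrat()`. The small recursive grammar is a made-up example, used only to show that packrat is switched on once, before the grammar is built, and that parsing is otherwise unchanged. How much it helps depends on how often the grammar backtracks and re-parses the same text.

```python
import pyparsing as pp

# Switch on memoisation once, right after the import and before
# any grammar expressions are created.
pp.ParserElement.enablePackrat()

# Hypothetical toy grammar: integers, +/- and nested parentheses.
integer = pp.Word(pp.nums)
expr = pp.Forward()
atom = integer | pp.Group(pp.Suppress("(") + expr + pp.Suppress(")"))
expr <<= atom + pp.ZeroOrMore(pp.oneOf("+ -") + atom)

result = expr.parseString("1 + (2 - (3 + 4))", parseAll=True)
print(result.asList())
```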
From: Daniel L. <dl...@gm...> - 2014-10-27 16:33:01
|
Thanks for the quick response, Paul.

With a 10 MiB input file of which no top-level element is longer than ~10 kB, it takes about 5 GiB of memory and 5 minutes before parseString() starts returning results. I tried enablePackrat() and memory usage is somewhat higher but speed is not appreciably improved.

Based on the docstring, I wouldn't expect enablePackrat to make a big improvement, since every lengthy block in the grammar I'm trying to parse is introduced with a unique keyword, so I don't think there's much backtracking-and-reparsing.

So I think I'm going to have to do incremental parsing in order to get reasonably fast feedback from the parser. Do you have any suggestions for how to do this? I'm trying to figure out if there's a good way to do greedy consumption of trailing (whitespace|comments) at the end of each valid top-level element.

-Dan

On Mon, Oct 27, 2014 at 9:17 AM, <pt...@au...> wrote:
> Before you go too far down this path, try enabling packrat parsing, which
> should help both performance and memory footprint.
>
> Right after importing pyparsing, add this line:
>
> ParserElement.enablePackrat()
>
> -- Paul
>
> ---- Dan Lenski <dl...@gm...> wrote:
> > I'm using PyParsing to parse some rather large text files with a C-like
> > format (braces and semicolons and all that). PyParsing works just great,
> > but it is slow and consumes a very large amount of memory due to the
> > size of my files.
|
From: Dan L. <dl...@gm...> - 2014-10-27 16:05:18
|
I'm using PyParsing to parse some rather large text files with a C-like format (braces and semicolons and all that). PyParsing works just great, but it is slow and consumes a very large amount of memory due to the size of my files.

I wanted to try to implement an incremental parsing approach wherein I'd parse the top-level elements of the source file one-by-one. The scanString method seems like the obvious way to do this. However, I want to make sure that there is no invalid/unparseable text in-between the sections parsed by scanString, and can't figure out a good way to do this.

Here's a simplified example that shows the problem I'm having:

    sample="""f1(1,2,3);
    f2_no_args( );
    # comment out: foo(4,5,6);
    bar(7,8);
    this should be an error;
    baz(9,10);
    """

    from pyparsing import *

    COMMENT=Suppress('#' + restOfLine())
    SEMI,COMMA,LPAREN,RPAREN = map(Suppress,';,()')
    ident = Word(alphas, alphanums+"_")
    integer = Word(nums+"+-",nums)
    statement = ident("fn") + LPAREN + Group(Optional(delimitedList(integer)))("arguments") + RPAREN + SEMI
    p = statement.ignore(COMMENT)

    for res, start, end in p.scanString(sample):
        print("***** (%d,%d)" % (start, end))
        print(res.dump())

When I run this, the ranges returned by scanString are discontiguous due to unparsed text between them ((0,10),(11,25),(53,62),(88,98)). Two of these gaps are whitespace or comments, which should not trigger an error, but one of them (`this should be an error;`) contains unparseable text, which I want to catch.

Is there a way to use pyparsing to parse a file incrementally while still ensuring that the entire input could be parsed with the specified parser grammar? Perhaps it is possible to make scanString "greedy" so that it parses valid whitespace or comments following each range? If so, that would help me resolve this issue since it would ensure that gaps only occur between the returned ranges when there's an error.

Thanks,
Dan Lenski

PS: I found this related thread on the discussion board, but there doesn't appear to be a resolution for this issue in it: http://pyparsing.wikispaces.com/share/view/30891763
|
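One way to approach the question with only public API is to keep using scanString but verify each gap between consecutive matches: a gap is acceptable only if it holds nothing but whitespace and comments. The sketch below assumes that reading of the requirement. `scan_strict`, `_check_gap` and `filler` are hypothetical names introduced here, not pyparsing functions, and the grammar is the one from the post with the sample laid out one statement per line.

```python
from pyparsing import (Group, Optional, ParseException, StringEnd, Suppress,
                       Word, ZeroOrMore, alphanums, alphas, delimitedList,
                       nums, restOfLine)

sample = """f1(1,2,3);
f2_no_args( );
# comment out: foo(4,5,6);
bar(7,8);
this should be an error;
baz(9,10);
"""

COMMENT = Suppress('#' + restOfLine)
SEMI, COMMA, LPAREN, RPAREN = map(Suppress, ';,()')
ident = Word(alphas, alphanums + "_")
integer = Word(nums + "+-", nums)
statement = (ident("fn") + LPAREN
             + Group(Optional(delimitedList(integer)))("arguments")
             + RPAREN + SEMI)

# Anything allowed between statements: comments (whitespace is skipped anyway).
filler = ZeroOrMore(COMMENT) + StringEnd()

def _check_gap(text, begin, end):
    """Raise ValueError if text[begin:end] is more than whitespace/comments."""
    try:
        filler.parseString(text[begin:end])
    except ParseException:
        raise ValueError("unparseable text at offsets %d-%d: %r"
                         % (begin, end, text[begin:end].strip()))

def scan_strict(expr, text):
    """Yield (tokens, start, end) like scanString, checking every gap first."""
    prev_end = 0
    for tokens, start, end in expr.scanString(text):
        _check_gap(text, prev_end, start)
        yield tokens, start, end
        prev_end = end
    _check_gap(text, prev_end, len(text))  # trailing text after the last match

names = []
error = None
try:
    for tokens, start, end in scan_strict(statement.ignore(COMMENT), sample):
        names.append(tokens.fn)
except ValueError as err:
    error = str(err)

print(names)
print(error)
```

Because this is a generator, results are still delivered one top-level element at a time, and the error surfaces as soon as the scan reaches the first match after the bad text.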
From: Glenn P. <gle...@gm...> - 2014-09-04 10:00:49
|
Hmm, I got it going with this:

    point = Literal( "." )
    e = CaselessLiteral( "E" )
    fnumber = Combine( Word( "+-"+nums, nums ) +
                       Optional( point + Optional( Word( nums ) ) ) +
                       Optional( e + Word( "+-"+nums, nums ) ) )
    ident = Word(alphas, alphas+nums+"_$")
    plus = Literal( "+" )
    minus = Literal( "-" )
    mult = Literal( "*" )
    div = Literal( "/" )
    lte = Literal( ">=" )
    lt = Literal( "<" )
    gt = Literal( ">" )
    anding = CaselessLiteral( "and" )
    oring = CaselessLiteral( "or" )
    lpar = Literal( "(" ).suppress()
    rpar = Literal( ")" ).suppress()
    addop = plus | minus | lte | lt | gt
    multop = mult | div
    logicalop = anding | oring
    expop = Literal( "^" )
    pi = CaselessLiteral( "PI" )
    expr = Forward()
    atom = ((Optional(oneOf("- +")) +
             (pi|e|fnumber|ident+lpar+expr+rpar).setParseAction(self.pushFirst))
            | Optional(oneOf("- +")) + Group(lpar+expr+rpar)
            ).setParseAction(self.pushUMinus)
    # by defining exponentiation as "atom [ ^ factor ]..." instead of
    # "atom [ ^ atom ]...", we get right-to-left exponents, instead of left-to-right
    # that is, 2^3^2 = 2^(3^2), not (2^3)^2.
    factor = Forward()
    factor <<= atom + ZeroOrMore( ( expop + factor ).setParseAction( self.pushFirst ) )
    term = factor + ZeroOrMore( ( multop + factor ).setParseAction( self.pushFirst ) )
    term2 = term + ZeroOrMore( ( addop + term ).setParseAction( self.pushFirst ) )
    expr <<= term2 + ZeroOrMore( ( logicalop + term2 ).setParseAction( self.pushFirst ) )

Can't say I 100% understand the grammar though. Any short explanations would be great.
|
From: Glenn P. <gle...@gm...> - 2014-09-04 09:23:39
|
Hi, I am new to pyparsing and I am having trouble understanding the grammar of the example fourFn.py <http://pyparsing.wikispaces.com/file/view/fourFn.py/30154950/fourFn.py>

I am trying to add comparison operators and logical operators to this example. The comparisons work OK, but I am not sure how to add the logical operators AND, OR and NOT.

So far I have this:

    plus = Literal( "+" )
    minus = Literal( "-" )
    mult = Literal( "*" )
    div = Literal( "/" )
    lte = Literal( ">=" )
    lt = Literal( "<" )
    gt = Literal( ">" )
    anding = CaselessLiteral( "and" )
    lpar = Literal( "(" ).suppress()
    rpar = Literal( ")" ).suppress()
    addop = plus | minus | lte | lt | gt
    multop = mult | div
    expop = Literal( "^" )
    pi = CaselessLiteral( "PI" )
    sensor_values = CaselessLiteral( "sensor_values" )
    expr = Forward()
    atom = ((Optional(oneOf("- +")) +
             (sensor_values|pi|e|fnumber|ident+lpar+expr+rpar).setParseAction(self.pushFirst))
            | Optional(oneOf("- +")) + Group(lpar+expr+rpar)
            ).setParseAction(self.pushUMinus)
    # by defining exponentiation as "atom [ ^ factor ]..." instead of
    # "atom [ ^ atom ]...", we get right-to-left exponents, instead of left-to-right
    # that is, 2^3^2 = 2^(3^2), not (2^3)^2.
    factor = Forward()
    factor <<= atom + ZeroOrMore( ( expop + factor ).setParseAction( self.pushFirst ) )
    term = factor + ZeroOrMore( ( multop + factor ).setParseAction( self.pushFirst ) )
    expr <<= term + ZeroOrMore( ( addop + term ).setParseAction( self.pushFirst ) )

Could someone perhaps explain the three lines below, and how I would add my CaselessLiteral( "and" ) boolean so I could do something like

    nsp.eval('10 >= 5 AND 10 < 15')

    factor <<= atom + ZeroOrMore( ( expop + factor ).setParseAction( self.pushFirst ) )
    term = factor + ZeroOrMore( ( multop + factor ).setParseAction( self.pushFirst ) )
    expr <<= term + ZeroOrMore( ( addop + term ).setParseAction( self.pushFirst ) )

PS: What is the <<= operator? I can't find the docs for it.

Thanks for any help.
|
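On the `<<=` question: `Forward()` creates a placeholder expression that can be used before it is defined, and `<<=` later fills in its definition, which is what makes recursive rules such as "an atom may contain a parenthesised expr" possible. Each of the three lines then adds one precedence layer on top of the previous one. The sketch below shows the same layering without the evaluation stack or parse actions, with comparisons and logical operators given layers of their own. The names `arith`, `comparison` and `logical_op` are illustrative, and `CaselessKeyword` is swapped in for the post's `CaselessLiteral` so that "and" is not matched inside a longer word.

```python
from pyparsing import (CaselessKeyword, Forward, Group, Literal, Word,
                       ZeroOrMore, nums, oneOf)

number = Word(nums)
lpar = Literal("(").suppress()
rpar = Literal(")").suppress()

expr = Forward()  # placeholder: referenced in atom, defined at the bottom
atom = number | Group(lpar + expr + rpar)

# One layer per precedence level, tightest binding first.
term = atom + ZeroOrMore(oneOf("* /") + atom)
arith = term + ZeroOrMore(oneOf("+ -") + term)
comparison = arith + ZeroOrMore(oneOf(">= <= < >") + arith)
logical_op = CaselessKeyword("and") | CaselessKeyword("or")

# <<= assigns the definition into the Forward placeholder.
expr <<= comparison + ZeroOrMore(logical_op + comparison)

flat = expr.parseString("10 >= 5 AND 10 < 15", parseAll=True)
nested = expr.parseString("(1 + 2) * 3", parseAll=True)
print(flat.asList())
print(nested.asList())
```

Putting the logical operators in the outermost layer means they bind most loosely, so each side of AND is parsed as a complete comparison first.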
From: Andreas M. <and...@gm...> - 2014-08-29 15:20:25
|
Athanasios Anastasiou wrote:

> 1) Do you really need your parsed string to be enclosed in three single
> quotes?

Actually I'm parsing a file with parseFile().

> 2) The error you are getting is reasonable. The rule p expects to start at
> the beginning of the parsed string. By defining g as 'b'+p, you are placing
> rule p AFTER the beginning of the line (which is where the character 'b'
> will be).

But the string consists of 3 lines, and for the second line I created a parser with LineStart() and LineEnd(). That's why I'm using LineStart() to get the beginning of the second line. Is this wrong?

What puzzles me is that the parser yields the correct results as long as no actions are attached to it. At least it's the result I expected. But after adding setParseAction() it no longer works.

Maybe I should explain my original case: I am trying to parse a file which has /one/ particular line containing a few numbers separated by spaces. I do not know how many numbers there are in this line. I just know that they are in this particular line. Thus I need to take care of the line break in this line. For the rest of the file (before and after this line) I do not need to (and don't want to) take care of line breaks.

Ciao
Andreas
|
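For the "one line-sensitive line" case described above, one option that sidesteps LineStart altogether (and so does not explain the reported error) is to stop the repeated item from skipping newlines and to consume the line breaks explicitly with LineEnd. This is a minimal sketch under that assumption, reusing the 'b' and 'a' tokens from the earlier example. `a_item` and `a_line` are names introduced here for illustration.

```python
import pyparsing as pp

# Only spaces count as skippable whitespace for this item, so a repetition
# of it cannot run past the end of the line.
a_item = pp.Literal('a').setWhitespaceChars(' ')
a_line = pp.OneOrMore(a_item) + pp.LineEnd().suppress()

# 'b' on its own line, then the line-sensitive line; each line break is
# matched explicitly instead of being skipped as whitespace.
g = pp.Literal('b') + pp.LineEnd().suppress() + a_line

res = g.parseString('b\na a\na\n')
print(res.asList())
```

The trailing `a` on the third line is left unparsed here, since parseString does not require the whole input to match unless `parseAll=True` is passed.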
From: Athanasios A. <ath...@gm...> - 2014-08-29 12:29:08
|
Hello Andreas,

I briefly had a look at your example and I noticed two things:

1) Do you really need your parsed string to be enclosed in three single quotes?

2) The error you are getting is reasonable. The rule p expects to start at the beginning of the parsed string. By defining g as 'b'+p, you are placing rule p AFTER the beginning of the line (which is where the character 'b' will be).

Removing LineStart and enclosing the last string in single quotes (and of course, escaping the line breaks) seems to work as expected (?).

If you want to parse the string EXACTLY as it appears (including the line breaks) you will have to include them in your rules.

Hope this helps
AA
|
From: Andreas M. <and...@gm...> - 2014-08-28 22:00:24
|
The following example works as expected. But if I uncomment the line ``p.setParseAction(foo)'' I get the error:

    pyparsing.ParseException: Expected start of line (at char 1), (line:1, col:1)

I do not understand what's going on here. Any help?

Ciao
Andreas

    import pyparsing as pypa

    p = \
        pypa.LineStart() + \
        pypa.OneOrMore( pypa.Literal('a').setWhitespaceChars(' ') ) + \
        pypa.LineEnd()

    def foo (t):
        print(t)

    #p.setParseAction(foo)

    g = pypa.Literal('b') + p

    res = g.parseString('''b
    a a
    a
    ''')

    print(res)
|