pyparsing-users Mailing List for Python parsing module (Page 7)
Brought to you by:
ptmcg
From: lilydjwg <lil...@gm...> - 2013-04-12 08:23:20
The following code can print two different results:

    from pyparsing import *

    word = Word(alphas).setResultsName('word')
    f = Forward().setResultsName('forward')
    f << word + Optional(f)
    r = f.parseString('abc def')
    print(r.asXML())

Run it several times. Sometimes it prints what is expected:

    <forward>
      <word>abc</word>
      <word>def</word>
    </forward>

But sometimes I get this:

    <forward>
      <forward>abc</forward>
      <forward>def</forward>
    </forward>

This is with pyparsing 2.0.0; the old version is OK. It is caused by this statement around line 444 in asXML:

    namedItems = dict((v[1],k) for (k,vlist) in self.__tokdict.items()
                                for v in vlist)

v[1] may be the same for items stored under different names, so which name ends up in namedItems depends on the order in which the dict is iterated.

Sorry, I can't file a bug report on SourceForge because it won't let me register.

--
Best regards,
lilydjwg
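The collision described in this report can be reproduced without pyparsing. The sketch below is only an illustration: it uses a plain dict as a hypothetical stand-in for the private `__tokdict`, assumes each entry is a (value, offset) pair as the `v[1]` in the quoted statement suggests, and the names `invert`, `tokdict_a` and `tokdict_b` are made up. When two result names share an offset, the name visited last wins, so the result follows the dict's iteration order. On Python 3.3 that order can change between runs for string keys because hash randomization is on by default, which would fit the intermittent output reported here.

```python
# Hypothetical stand-in for ParseResults' private name -> [(value, offset)] map.
def invert(tokdict):
    """Map each offset to a result name, the way the quoted statement does."""
    return dict((v[1], k) for (k, vlist) in tokdict.items() for v in vlist)


# The same entries, inserted in two different orders.
entries = {
    "word": [("abc", 0), ("def", 1)],
    "forward": [("abc", 0), ("def", 1)],
}
tokdict_a = {name: entries[name] for name in ("word", "forward")}
tokdict_b = {name: entries[name] for name in ("forward", "word")}

# Dicts keep insertion order on Python 3.7+, so the two orders are
# simulated here by building the dict twice.
names_a = invert(tokdict_a)  # "forward" is visited last and wins
names_b = invert(tokdict_b)  # "word" is visited last and wins
print(names_a)
print(names_b)
```

The same input gives two different name tables, which is the kind of divergence that would show up as `<word>` in one run and `<forward>` in another.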
From: Mika S. <mik...@gm...> - 2013-04-12 07:54:32
Hi,

I have a small application which uses pyparsing from multiple threads. The parser is a singleton and the actual parseString call is inside a Lock, so it should be thread safe. (Cuts from the script are below.)

The problem is that after parseBlock has returned the ParseResults to me and I have gone through the whole list, I cannot free the ParseResults dictionaries. If parseBlock is called many times with different scripts, I end up with a huge number of dictionaries (len(objgraph.by_type('dict'))) and lose memory bit by bit.

I have tried deleting the entries with del, but I haven't figured out the correct way of cleaning up the ParseResults. How should I delete the returned ParseResults?

I have tested both scanString and parseString for my case, but I think parseString is more suitable. Both raise the memory usage.

Thank you very much for any tips, and huge thanks for pyparsing,
-Mika

    @MySingleton
    class MyScriptParser:
        def __init__(self):
            self.syntax()

        def syntax(self):
            LPAR,RPAR,LBRACE,RBRACE,SEMI,COMMA,PROCENT,DOL = map(Suppress, "(){};,%$")

            # Types
            NAME = Word(alphas+"_", alphanums+"_")
            NUMBER = Word(nums)
            STRING = QuotedString('"')
            VARSTR = dblQuotedString
            CALL = Keyword("call")
            IF = Keyword("if")
            FOR = Keyword("for")
            FUNC = Suppress("function")
            PRINT = Keyword("print")
            ELSE = Keyword("else")

            # Collection types
            var = DOL + NAME | VARSTR

            # Arithmetic expression
            operand = NAME | var | NUMBER | STRING
            expr = Forward()
            expr << (operatorPrecedence(operand,
                [
                ("!", 1, opAssoc.LEFT),
                (oneOf("+ -"), 1, opAssoc.RIGHT),             # leading sign
                (oneOf("++ --"), 1, opAssoc.RIGHT),           # Add / Subtract
                (oneOf("++ --"), 1, opAssoc.LEFT),            # Add / Subtract
                (oneOf("* / %"), 2, opAssoc.LEFT),            # Multiply
                (oneOf("+ -"), 2, opAssoc.LEFT),              # Add / Subtract
                (oneOf("< == > <= >= !="), 2, opAssoc.LEFT),  # Comparison
                ("=", 2, opAssoc.LEFT)                        # Assign
                ]) + Optional(LPAR + Group(Optional(delimitedList(expr))) + RPAR))
            expr.setParseAction(createTokenObject)

            # Initialize Statement
            stmt = Forward()

            # Body
            body = ZeroOrMore(stmt)

            # Function
            funcdecl = FUNC - Dict(Group(OneOrMore(STRING + LPAR + Group(Optional(Group(delimitedList(var)))) + RPAR + LBRACE + Group(body) + RBRACE)))
            #funcdecl.setName("funcdecl").setDebug()
            funcdecl.setName("funcdecl")
            funcdecl.setParseAction(createTokenObject)

            # Keyword statements
            ifstmt = OneOrMore(Group(IF + LPAR + expr + RPAR + Group(stmt) + Optional(Group(ELSE + Group(stmt)))))
            #ifstmt.setName("ifstmt").setDebug()
            ifstmt.setName("ifstmt")
            ifstmt.setParseAction(createTokenObject)

            callstmt = Group(CALL + LPAR + Group(Optional(delimitedList(var))) + RPAR) + SEMI
            # callstmt.setName("callstmt").setDebug()
            callstmt.setName("callstmt")
            callstmt.setParseAction(createTokenObject)

            forstmt = Group(FOR + LPAR + Group(Optional(expr) + SEMI + Optional(expr) + SEMI + Optional(expr)) + RPAR + Group(stmt))
            #forstmt.setName("forstmt").setDebug()
            forstmt.setName("forstmt")
            forstmt.setParseAction(createTokenObject)

            printstmt = Group(PRINT + LPAR + Optional(delimitedList(var)) + Optional(STRING + Optional(PROCENT + LPAR + delimitedList(var) + RPAR)) + RPAR) + SEMI
            #printstmt.setName("printstmt").setDebug()
            printstmt.setName("printstmt")
            printstmt.setParseAction(createTokenObject)

            genericstmt = Group(NAME + LPAR + Group(Optional(delimitedList(var))) + RPAR) + SEMI
            # genericstmt.setName("genericstmt").setDebug()
            genericstmt.setName("genericstmt")
            genericstmt.setParseAction(createTokenObject)

            # Setup statement
            stmt << (callstmt | ifstmt | forstmt | printstmt | genericstmt | expr + SEMI | LBRACE + ZeroOrMore(stmt) + RBRACE)

            # Main program
            self.program = ZeroOrMore(funcdecl)
            self.program.ignore(pythonStyleComment)
            ParserElement.enablePackrat()

        def parseBlock(self, script):
            # Parse the script
            myglobalvariablehere.acquire()
            parsed = self.program.parseString(script, parseAll=True)
            # parsed = self.program.scanString(script)
            myglobalvariablehere.release()
            # And return the list
            return parsed
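One detail in the parseBlock excerpt above is independent of the memory question: if parseString raises, release() is never reached and the lock stays held. The sketch below shows one way to guard against that with a `with` block. It is not the poster's code: `parse_lock`, `program` and `parse_block` are hypothetical stand-ins, and the toy grammar only accepts alphabetic words. It also returns `asList()` so callers hold plain lists rather than ParseResults; whether that changes the memory growth described above is an assumption that would need measuring, for example with objgraph.

```python
import threading

from pyparsing import ParseException, Word, ZeroOrMore, alphas

# Hypothetical stand-ins for the poster's global lock and grammar.
parse_lock = threading.Lock()
program = ZeroOrMore(Word(alphas))


def parse_block(script):
    """Parse under the lock and hand back plain lists, not ParseResults."""
    # The with-statement releases the lock even if parseString raises.
    with parse_lock:
        parsed = program.parseString(script, parseAll=True)
        return parsed.asList()


print(parse_block("abc def"))

try:
    parse_block("abc 123")  # digits are not part of the toy grammar
except ParseException:
    pass

# The lock can be taken again after the failed parse.
lock_was_free = parse_lock.acquire(False)
if lock_was_free:
    parse_lock.release()
print(lock_was_free)
```

With explicit acquire() and release() calls, the same guarantee needs a try/finally around the parse.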
From: John W. S. <jo...@nm...> - 2013-03-07 23:21:21
|
Here is a new resource for pyparsing users: http://www.nmt.edu/tcc/help/pubs/pyparsing pyparsing quick reference: A Python text processing tool This is a supplement to, not a replacement for, the online reference documentation. There are lots of short conversational examples showing how the various features work. It should not be construed as the official word on any pyparsing feature, although we will of course strive to make it match how the software works and speedily evaluate any corrections or suggestions. This document is available as a 48-page PDF as well as a fully chunked HTML version. I wrote it in part because I'm a paper-oriented dinosaur and the online docs are unprintable. There is a section on how to structure large complex ParseResults instances, based on principles I worked out in my latest pyparsing project. Not every feature is covered. I left out a few features because I couldn't understand what they were for or how they worked. Also omitted were all the SGML/XML-family features because in my personal opinion the lxml module is vastly superior for reading or writing such documents. Also linked within this document are two extended examples. 1. http://www.nmt.edu/~shipman/aba/raw/doc/ims abaraw internal maintenance specification This script parses a free-field data format and writes XML. There are about 30 productions in the grammar. It implements the grammar described in the accommpanying spec: http://www.nmt.edu/~shipman/aba/raw/doc abaraw: A shorthand notation for bird records 2. http://www.nmt.edu/tcc/help/lang/python/examples/icalparse/ A parser for the "old ical", a Unix calendar program. The grammar has about a dozen productions. Includes an example of the Forward pattern. Best regards, John Shipman (jo...@nm...), Applications Specialist New Mexico Tech Computer Center, Speare 146, Socorro, NM 87801 (575) 835-5735, http://www.nmt.edu/~john ``Let's go outside and commiserate with nature.'' --Dave Farber |
From: Álvaro J. [T. <alv...@gm...> - 2013-02-27 15:26:32
On Wed, Feb 27, 2013 at 11:51 AM, <pt...@au...> wrote:
> [...] Fortunately both pip and easy_install offer a command syntax to
> explicitly request a particular version of pyparsing, so Python 2 users do
> have a not-too-terrible workaround for the default behavior of installing
> the latest version of a package.

You can maintain one version for Python 2 and another for Python 3 in the same package, just by changing its setup.py to detect which Python version is running during the installation, like httplib2 does:
https://code.google.com/p/httplib2/source/browse/setup.py

[]s
--
Álvaro Justen "Turicas"
http://blog.justen.eng.br
http://twitter.com/turicas
http://CursoDeArduino.com.br
http://github.com/turicas
+55 21 9898-0141
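The version-detection idea in this message could look roughly like the sketch below. This is not httplib2's or pyparsing's actual setup.py: `source_dir_for` is a hypothetical helper, and the `python2/` and `python3/` directory layout is an assumption made for illustration.

```python
import sys


def source_dir_for(version_info=sys.version_info):
    """Pick the source tree that matches the running interpreter."""
    # Assumed layout: python2/ holds the 1.5.x code, python3/ the 2.x code.
    return "python%d" % version_info[0]


# A setup.py could pass the result on, for example:
#     setup(..., package_dir={"": source_dir_for()})
dir_for_py2 = source_dir_for((2, 7, 3))
dir_for_py3 = source_dir_for((3, 3, 0))
print(dir_for_py2, dir_for_py3)
```

The design choice here is that one release carries both code bases and the installer chooses at install time, so dependents would not need to pin a version themselves.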
From: <pt...@au...> - 2013-02-27 14:51:37
Alvaro -

Unfortunately this is the best I can offer. We are in the awkward in-between Python2-or-Python3 compatibility period, which is likely to last a few more years at least. My previous approach to handling the version dichotomy was error-prone, and I finally decided to just move forward to Python 3 as my main supported Python version. Starting with 2.0.1, I will start converting over the supporting code and examples to be more Python3 idiomatic - this of course will not affect the 1.5.x branch of pyparsing.

Fortunately both pip and easy_install offer a command syntax to explicitly request a particular version of pyparsing, so Python 2 users do have a not-too-terrible workaround for the default behavior of installing the latest version of a package.

-- Paul
From: Álvaro J. [T. <alv...@gm...> - 2013-02-27 13:29:46
On 27/02/2013 06:01, "Paul McGuire" <pt...@au...> wrote:
> Can you precede your install of pydot with:
>
> pip install pyparsing==1.5.7

Yes, I know I can do that, but honestly I don't want to, since my library does not depend directly on pyparsing.

> Hopefully this will explicitly load the Python2-compatible version of
> pyparsing, and then pydot won't need to try to autoresolve it.

Are you *seriously* going to break everything that needs pyparsing and uses Python 2?
From: Paul M. <pt...@au...> - 2013-02-27 09:46:23
Can you precede your install of pydot with:

    pip install pyparsing==1.5.7

Hopefully this will explicitly load the Python2-compatible version of pyparsing, and then pydot won't need to try to autoresolve it.

-- Paul
From: Álvaro J. [T. <alv...@gm...> - 2013-02-27 08:34:47
Hello,

I'm developing a project [https://github.com/NAMD/pypelinin] that uses python-graph-dot [https://pypi.python.org/pypi/python-graph-dot], which depends on pyparsing. I found out that you released version 2.0.0 some hours ago because I tried to package my library to test some stuff and the installation failed with a SyntaxError when pip was trying to install pyparsing.

I'm using Python 2.7.3 and the full log can be found at:
https://gist.github.com/turicas/5046284

Basically, the SyntaxError is raised in file "pyparsing.py", line 629 ("nonlocal limit,foundArity"). To reproduce the problem you just need to install it in a new virtualenv:

    cd /tmp
    virtualenv pyparsing-test
    source pyparsing-test/bin/activate
    pip install python-graph-dot
    deactivate
    rm -rf pyparsing-test

Are there any plans to fix this problem? I'm considering removing the python-graph-dot dependency from my library for now, because this bug means I simply can't install it.

Thanks,
[]s
--
Álvaro Justen "Turicas"
http://blog.justen.eng.br
http://twitter.com/turicas
http://CursoDeArduino.com.br
http://github.com/turicas
+55 21 9898-0141
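The `nonlocal` statement quoted in the error exists only in Python 3; Python 2 rejects it when the file is compiled, which is consistent with the SyntaxError reported above. A minimal sketch of what the statement does follows. The names `make_counter`, `bump` and `total` are hypothetical and unrelated to pyparsing's own code.

```python
def make_counter():
    """Return a function that counts its own calls."""
    total = 0

    def bump():
        # Python 3 only: rebinds the variable in the enclosing function.
        # Python 2 stops with a SyntaxError on this line before running anything.
        nonlocal total
        total += 1
        return total

    return bump


bump = make_counter()
first, second = bump(), bump()
print(first, second)
```

Because the failure happens at compile time, a single `nonlocal` anywhere in a module is enough to make the whole file unusable on Python 2.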
From: Michael D. <md...@st...> - 2013-02-26 19:39:07
It looks like there are no files associated with 1.5.7 or 2.0.0, though there are for 1.5.6. I think that's why it's still only giving me 1.5.6. Would you mind updating the PyPI records?

Mike

On 02/26/2013 02:28 PM, Michael Droettboom wrote:
> It seems that `pip install pyparsing` (in a clean environment) installs
> version 1.5.6 for both Python 2.7 and 3.3. I'm generally quite lost poking
> around PyPI, but is there a reason why it's not giving me the latest
> version?
>
> Mike
From: Michael D. <md...@gm...> - 2013-02-26 19:29:21
It seems that `pip install pyparsing` (in a clean environment) installs version 1.5.6 for both Python 2.7 and 3.3. I'm generally quite lost poking around PyPI, but is there a reason why it's not giving me the latest version?

Mike

--
Michael Droettboom
http://www.droettboom.com/
From: Paul M. <pt...@au...> - 2013-01-16 04:11:57
Geoff -

Congratulations on your first steps with pyparsing. You have found scanString and how it returns the start and end locations of each match. Pyparsing also includes transformString, which is a wrapper around scanString that does the kind of injection you are doing. transformString applies all parse actions, which can modify or enhance the parsed strings by returning a different string than the one passed in in the tokens argument. See how I've added a parse action to a slightly different version of your word expression:

    word = Word(alphas, printables, excludeChars='<>&')
    word.ignore(anyOpenTag)
    word.ignore(anyCloseTag)
    word.ignore(commonHTMLEntity)

    tagnum = 0
    def addMarkTags(tokens):
        global tagnum
        tagnum += 1
        return "<mark id='%d'>%s</mark>" % (tagnum, tokens[0])
    word.setParseAction(addMarkTags)

    print(word.transformString(html))

This will print:

    <h2 class="chapterNumber"><span class="bold"><mark id='1'>S</mark></span><mark id='2'>ome</mark> <mark id='3'>Book</mark></h2>
    <p class="para"><mark id='4'>I</mark> <mark id='5'>have</mark> <mark id='6'>so</mark> <mark id='7'>many</mark> <span class="italic"><mark id='8'>National</mark> <mark id='9'>Geographic</mark></span>&<mark id='10'>rsquo;s</mark> <mark id='11'>at</mark> <mark id='12'>home.</mark></p>

I think transformString is the avenue to follow for this project.

-- Paul
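For trying the approach in this reply on its own, here is a small self-contained variant. It is a sketch, not Paul's code: the sample string is a hypothetical, much shorter input than the HTML in the thread, `make_marker` and `add_mark_tags` are made-up names, and the counter lives in a closure rather than a module-level global. It relies on the same `Word(alphas, printables, excludeChars='<>&')` expression, so a word starts with a letter and stops at markup characters.

```python
from itertools import count

from pyparsing import Word, alphas, anyCloseTag, anyOpenTag, printables

# Words start with a letter and may not contain markup characters.
word = Word(alphas, printables, excludeChars="<>&")
word.ignore(anyOpenTag)
word.ignore(anyCloseTag)


def make_marker():
    """Return a parse action that numbers each word it wraps."""
    numbers = count(1)  # private counter instead of a module-level global

    def add_mark_tags(tokens):
        return "<mark id='%d'>%s</mark>" % (next(numbers), tokens[0])

    return add_mark_tags


word.setParseAction(make_marker())

sample = "<p>I have</p>"  # hypothetical input
marked = word.transformString(sample)
print(marked)
```

Calling `make_marker()` again gives a fresh counter, which may be convenient if the numbering should restart for each document.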
From: Geoff J. <fo...@ju...> - 2013-01-15 21:01:30
With the following:

-----
from pyparsing import *

html = EXAMPLE_HTML_FROM_PREVIOUS_POST

word = Word(printables)
word.ignore(anyOpenTag)
word.ignore(anyCloseTag)
word.ignore(commonHTMLEntity)
text = word

for w, s, e in text.scanString(html):
    print('%s between %s and %s [From HTML: %s]' % (w, s, e, html[s:e]))
-----

I get (with some omissions):

-----
....
['I'] between 2443 and 2444 [From HTML: I]
['have'] between 2445 and 2449 [From HTML: have]
['so'] between 2450 and 2452 [From HTML: so]
['many'] between 2453 and 2457 [From HTML: many]
['National'] between 2479 and 2487 [From HTML: National]
['Geographic</span>&rsquo;s'] between 2488 and 2513 [From HTML: Geographic</span>&rsquo;s]
['at'] between 2514 and 2516 [From HTML: at]
['home.</p>'] between 2517 and 2526 [From HTML: home.</p>]
-----

Which is a *great* start. From here, if I could:

1) Suppress any HTML tags in the string
2) Check the HTML entities against a list of 'splits' (e.g. endash, emdash etc.) and convert those to a space, otherwise convert the entity to UTF8.

Then I'd be good to go, I think! I can then use the word boundaries to inject the tags, and use the parsed string for my secondary process (which I need a UTF8 string for).
From: Geoff J. <fo...@ju...> - 2013-01-15 19:36:28
Hi,

First - sorry for the long email and the lack of PyParsing example code.

I'm trying to modify some HTML, wrapping 'words' in 'MARK' tags. I've tried BeautifulSoup, HTMLParser, and regexes, all with limited success. I think PyParsing is the right solution - all the other solutions are more for scraping/extracting data from HTML.

I hate asking questions without some code, but I'm so new to PyParsing that I really am not sure where to start. My gut tells me it's the right tool for the job though. Can anyone help me?

Take the following HTML as an example:

-----
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US">
<head>
<title>Some Book</title>
<link rel="stylesheet" type="application/vnd.adobe-page-template+xml" href="page-template.xpgt"/>
<style>
.italic {font-style: italic;}
.bold {font-style: bold;}
</style>
</head>
<body class="text" id="text">
<div class="chapter" id="ch02">
<div class="chapterHead">
<h2 class="chapterNumber"><span class="bold">S</span>ome Book</h2>
</div>
<div class="chapterBody">
<p class="para">I have so many <span class="italic">National Geographic</span>&rsquo;s at home.</p>
</div>
</div>
</body>
</html>
-----

There are 2 lines of interest:

-----
<p class="para">I have so many <span class="italic">National Geographic</span>&rsquo;s at home.</p>
<h2 class="chapterNumber"><span class="bold">S</span>ome Book</h2>
-----

I am trying to wrap the 'words' in 'MARK' tags. So my 'perfect' result would be:

-----
<h2 class="chapterNumber"><mark id='1'><span class="bold">S</span>ome</mark> <mark id='1'>Book</mark></h2>
<p class="para"><mark id='1'>I</mark> <mark id='2'>have</mark> <mark id='3'>so</mark> <mark id='4'>many</mark> <mark id='5'><span class="italic">National</span></mark> <mark id='6'><span class="italic">Geographic</span>&rsquo;s</mark> <mark id='7'>at</mark> <mark id='8'>home</mark>.</p>
-----

Now there is obviously some complexity in there, over and above the 'mark' injection. For example, the word "Geographic's" is split by a close-span, which started before the word 'National'. So the formatting is 'replayed' during the injection. There are also some differences in the 'MARK' location - sometimes 'tight' to the word, sometimes with 'SPAN' tags inside.

I don't expect PyParsing to be able to do that for me (I would love it if it could!) and so I am happy to have PyParsing generate 'broken' HTML that I will fix up in a post-process. So the following output would be acceptable:

-----
<h2 class="chapterNumber"><span class="bold"><mark id='1'>S</span>ome</mark> <mark id='1'>Book</mark></h2>
<p class="para"><mark id='1'>I</mark> <mark id='2'>have</mark> <mark id='3'>so</mark> <mark id='4'>many</mark> <span class="italic"><mark id='5'>National</mark> <mark id='6'>Geographic</span>&rsquo;s</mark> <mark id='7'>at</mark> <mark id='8'>home</mark>.</p>
-----

Note that the tags around "National Geographic's" are now 'broken'.

A 'word' can be described as: any text that is terminated by a space or punctuation, excluding the terminator itself.

An added complexity in my full file is that quote marks could also terminate a word, but only if the mark is not an apostrophe (e.g. "I'm excited" has 2 words: I'm, excited). And quotes could be HTML entities. But again, I am happy to deal with that in a post-process.

An acceptable alternative would be for PyParsing to return the start and end locations of 'whole' words (taking into account any interspersed HTML, like the close-span in Geographic's); then I can 'shuffle' the mark tag injection in a post-process.

Again, I'm sorry for not posting example code - I'm still wrapping my head around how PyParsing works. So if anyone can give me pointers, I'm happy to do the legwork myself! I'm going to spend all day trying to work this out.

Many thanks in advance,
Geoff
From: Florian L. <mai...@xg...> - 2013-01-07 13:21:55
|
Hey Paul!

Thanks for the welcome and the thorough explanation! I have more or less solved my original problem in the meantime, but I'm still at the very beginning!

Unfortunately I don't have a formal description of the language I'm trying to model. It is designed along the lines of C++, so I'll need to fiddle with the allowed characters in my primitives. (BTW: it's the configuration language of the OpenFOAM CFD toolbox, http://www.openfoam.com/)

My primitives are:

    ident = Word(alphanums + ".")
    semi = Literal(";").suppress()
    lcb = Literal("{").suppress()
    rcb = Literal("}").suppress()

I'll keep in mind what you said about excludeChars and maybe change ident that way; I'll have to try it out.

One problem I've encountered is that a key-value pair could look like this:

    key and all this is value;

I catch that with:

    FKeyValue = Group(ident + SkipTo(semi) + semi)

Since the file could also have key-value pairs at the root level (not within any dict), I do:

    ParameterFile = ZeroOrMore(FDictionary | FKeyValue)

Dictionaries can be arbitrarily nested:

    FDictionary = Forward()
    FDictionary << Dict(Group(ident + lcb + Dict(ZeroOrMore(FKeyValue | FDictionary)) + rcb))

I still have problems getting the recursive definition right (the underlying problem is probably getting the recursive definition right ;-)

My sample text is:

    prob = """dictname
    {
        subdict
        {
            key value;
            key2 value2;
        }
    }"""

and calling dump() on the parse result gives:

    [['dictname', ['subdict', '{\n key value'], ['key2', 'value2']]]
    - dictname: [['subdict', '{\n key value'], ['key2', 'value2']]
      - key2: value2
      - subdict: { key value

Thanks for any suggestions!
Florian |
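[Editor's sketch] The dump in the message above suggests that "subdict" was consumed as a key, with SkipTo(semi) swallowing everything up to the first semicolon. A likely cause is the order of the alternatives: pyparsing's | operator takes the first alternative that matches, so the key-value rule wins before the dictionary rule is tried. The following is a minimal sketch of one possible fix, simply trying the dictionary first; the lowercase names are mine, not from the thread.

```python
from pyparsing import (Dict, Forward, Group, SkipTo, Suppress, Word,
                       ZeroOrMore, alphanums)

ident = Word(alphanums + ".")
semi, lcb, rcb = map(Suppress, ";{}")

# "key and all this is value;" -> ['key', 'and all this is value']
key_value = Group(ident + SkipTo(semi) + semi)

dictionary = Forward()
# Try the dictionary alternative first: an ident followed by "{" must not
# be mistaken for the start of a key-value pair.
body = Dict(ZeroOrMore(dictionary | key_value))
dictionary << Group(ident + lcb + body + rcb)

parameter_file = Dict(ZeroOrMore(dictionary | key_value))

prob = """dictname
{
    subdict
    {
        key value;
        key2 value2;
    }
}"""

result = parameter_file.parseString(prob)
print(result.dump())
```

With this ordering, "key value;" still falls through to key_value, because the dictionary alternative fails as soon as it does not find "{" after the identifier.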
From: Florian L. <mai...@xg...> - 2013-01-05 20:16:52
|
Hey Paul!

Thanks for the welcome and the thorough explanation! I have more or less solved my original problem in the meantime, but I'm still at the very beginning!

Unfortunately I don't have a formal description of the language I'm trying to model. It is designed along the lines of C++, so I'll need to fiddle with the allowed characters in my primitives. (BTW: it's the configuration language of the OpenFOAM CFD toolbox, http://www.openfoam.com/)

My primitives are:

    ident = Word(alphanums + ".")
    semi = Literal(";").suppress()
    lcb = Literal("{").suppress()
    rcb = Literal("}").suppress()

I'll keep in mind what you said about excludeChars and maybe change ident that way; I'll have to try it out.

One problem I've encountered is that a key-value pair could look like this:

    key and all this is value;

I catch that with:

    FKeyValue = Group(ident + SkipTo(semi) + semi)

Since the file could also have key-value pairs at the root level (not within any dict), I do:

    ParameterFile = ZeroOrMore(FDictionary | FKeyValue)

Dictionaries can be arbitrarily nested:

    FDictionary = Forward()
    FDictionary << Dict(Group(ident + lcb + Dict(ZeroOrMore(FKeyValue | FDictionary)) + rcb))

I still have problems getting the recursive definition right (the underlying problem is probably getting the recursive definition right ;-)

My sample text is:

    prob = """dictname
    {
        subdict
        {
            key value;
            key2 value2;
        }
    }"""

and calling dump() on the parse result gives:

    [['dictname', ['subdict', '{\n key value'], ['key2', 'value2']]]
    - dictname: [['subdict', '{\n key value'], ['key2', 'value2']]
      - key2: value2
      - subdict: { key value

Thanks for any suggestions and have a nice weekend!
Florian |
From: Paul M. <pt...@au...> - 2013-01-05 16:59:23
|
Florian -

Welcome to pyparsing!

When writing your parser, you'll have to keep in mind that pyparsing does not do any kind of lookahead unless you explicitly tell it to. "printables" is a string containing all ASCII characters that are not whitespace - this includes the ';' character. So when you define your FKeyValue value part as "Word(printables)", this will consume all non-whitespace characters, even the terminating ';'. This is in contrast to something you might do in a regular expression, in which ".*;" would match "lslsd;" - the regular expression implicitly terminates the ".*" when it sees the semicolon. But pyparsing is purely left-to-right, unless you include some lookahead escapes of your own.

One way to do this in the Word construct is to be more selective in the string that you use to create the expression - in this case, we'll try just doing every printable character except for ';'. Instead of "Word(printables)", you could do "Word(''.join(c for c in printables if c != ';'))". I found myself doing this quite a lot and it annoyed me, so I added a convenience argument to Word, excludeChars. You can define a Word using a large string of characters, and then just exclude one or two of them, in your case like this:

    Word(printables, excludeChars=';')

Now if you use this expression for your value expression in FKeyValue, it should parse better.

By extension, I would also suggest that you narrow down what you expect to see as the identifiers in your key and dictionary, so that you don't accidentally read in braces or other punctuation, perhaps something like:

    identifier = Word(alphas, alphanums)
    FKeyValue = identifier + Word(printables, excludeChars=';') + ";"
    FDictionary = identifier + "{" + OneOrMore( Group(FKeyValue) ) + "}"

Also, by Grouping your FKeyValue's, it will help you iterate over the key-value pairs, as it will give them more organizing structure.

Please look over some of the articles that are linked from the wiki's Documentation page (http://pyparsing.wikispaces.com/Documentation), for more examples and expression topics. Also, the Discussion tab (http://pyparsing.wikispaces.com/page/messages/home) of the wiki's Home page includes many Q&A threads on various pyparsing problems.

Best of luck,
-- Paul McGuire |
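[Editor's sketch] The advice above can be tried out on the FoamFile sample from the original question. This is a small illustration rather than the definitive grammar: it assumes a pyparsing version that has the excludeChars argument, and unlike the reply it suppresses the punctuation (my choice) so that only keys and values remain in the result.

```python
from pyparsing import (Group, OneOrMore, ParseException, Suppress, Word,
                       alphanums, alphas, printables)

text = """
FoamFile
{
    version 2.0;
    format ascii;
    class volVectorField;
    object U;
}"""

# The original value expression is greedy: the second Word(printables)
# also consumes the ';', so the literal ";" can never match afterwards.
greedy = Word(printables) + Word(printables) + ";"
try:
    greedy.parseString("class volVectorField;")
    greedy_failed = False
except ParseException:
    greedy_failed = True

# Narrower expressions, as suggested in the reply above.
identifier = Word(alphas, alphanums)
value = Word(printables, excludeChars=";")
key_value = Group(identifier + value + Suppress(";"))
dictionary = identifier + Suppress("{") + OneOrMore(key_value) + Suppress("}")

result = dictionary.parseString(text)
print(greedy_failed)
print(result.asList())
```

Because each key-value pair is wrapped in a Group, the pairs can be iterated over as two-element lists after the dictionary name.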
From: Florian L. <mai...@xg...> - 2013-01-05 10:40:38
|
Hello,

I've just started working with pyparsing.

>>>
from pyparsing import *

text = """
FoamFile
{
    version 2.0;
    format ascii;
    class volVectorField;
    object U;
}"""

text2 = " class volVectorField;"

FKeyValue = Word(printables) + Word(printables) + ";"
FDictionary = Word(printables) + "{" + OneOrMore( FKeyValue ) + "}"

print FKeyValue.parseString(text2) # Works fine
print FDictionary.parseString(text) # Fails
<<<

(I use the F prefix to avoid name clashes with pyparsing stuff; I might change that and switch to a more selective import.)

The last print fails:

Traceback (most recent call last):
  File "parse.py", line 22, in <module>
    print FKeyValue.parseString(text2)
  File "/home/florian/scratch/pyparsing.py", line 1006, in parseString
    raise exc
pyparsing.ParseException: Expected ";" (at char 28), (line:1, col:29)

What is wrong there? If I understood the documentation correctly, newlines are ignored, just like whitespace.

It's pyparsing downloaded from the 1.5.x svn branch.

Thanks,
Florian |
From: Michael D. <md...@gm...> - 2012-11-16 14:13:09
|
Are there any plans for a release soon? In matplotlib on Python 3.x we are running into this bug, which is already fixed in SVN, and we'd like to stop shipping a patched version of pyparsing: http://pyparsing.svn.sourceforge.net/viewvc/pyparsing?view=revision&revision=219

Mike
--
Michael Droettboom
http://www.droettboom.com/ |
From: Claus R. <cla...@ro...> - 2012-09-25 10:00:46
|
OK, I solved it; some chars were missing.

Now I'm trying to handle German umlauts.

    text = Word( alphas8bit + alphanums + " -,/.|:_()+;&" )

does not work; the error message says:

    /usr/lib/python2.7/dist-packages/pyparsing.py:1698: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal

I tried mixing in unicode as well:

    text = Word( alphas8bit + alphanums + " -,/.|:_()+;&" + u"ü")

but it does not work either. Is there an equivalent of alphanums to use with unicode?

Claus |
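[Editor's sketch] The UnicodeWarning above is typical of mixing byte strings and unicode strings on Python 2. On Python 3, where every str is unicode, the umlauts can simply be listed among the allowed characters. The snippet below is only an illustration under that assumption (Python 3 with a current pyparsing); the umlauts string and the sample input are made up for the example.

```python
from pyparsing import Word, alphanums

# Assumption: these are the only non-ASCII letters that need to be accepted.
umlauts = "äöüÄÖÜß"
text = Word(alphanums + umlauts + " -,/.|:_()+;&")

parsed = text.parseString("Büro Süd, Gebäude 3")[0]
print(parsed)
```

On Python 2 the same idea would need every piece of the character set and the input to be unicode objects, which is where the original attempt appears to have gone wrong.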
From: Rolf-Helge P. <ro...@it...> - 2012-09-24 14:48:27
|
Dear all,

Have you already constructed a language using pyparsing? If so, we would appreciate it if you would participate in our brief survey on "Language Integration". Your responses to this survey will be used in a research project that aims to better support developers working on software systems containing multiple domain-specific languages and general-purpose languages. You will find the survey and further information at the following link: http://www.itu.dk/people/ropf/survey.html

Please excuse me for possibly spamming this group. Please also excuse me in case you consider this message a cross-post. This invitation is being sent to the following language engineering communities via their corresponding mailing lists or forums: Xtext, EMFText, ANTLR, parboiled, javacc, pyparsing. The reason is to allow for generalization of the survey's results by approaching multiple language engineering communities.

Best regards,
Helge
ro...@it...
Process and System Models Group
IT University of Copenhagen |
From: Johannes M. <Joh...@ne...> - 2012-09-20 13:01:31
|
Hey there.

I am currently working on adding proper error handling to my parser and ran into a problem with the ErrorStop functionality of the And class.

This works:

    CaselessKeyword("foo") + "bar"

But using ErrorStop raises an AttributeError:

    CaselessKeyword("foo") - "bar"

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pyparsing.py", line 1384, in __repr__
    return _ustr(self)
  File "/usr/local/lib/python2.7/dist-packages/pyparsing.py", line 122, in _ustr
    return str(obj)
  File "/usr/local/lib/python2.7/dist-packages/pyparsing.py", line 2394, in __str__
  File "/usr/local/lib/python2.7/dist-packages/pyparsing.py", line 122, in _ustr
    return str(obj)
  File "/usr/local/lib/python2.7/dist-packages/pyparsing.py", line 1381, in __str__
    return self.name
  File "/usr/local/lib/python2.7/dist-packages/pyparsing.py", line 1424, in __getattr__
    raise AttributeError("no such attribute " + aname)
AttributeError: no such attribute name

This is because, when "And.__sub__" initialises the "_ErrorStop" object, "_ErrorStop.__init__" calls:

    super(Empty, self).__init__(*args, **kwargs)

instead of:

    super(_ErrorStop, self).__init__(*args, **kwargs)

So the constructor of "Empty", which sets "name", is skipped. I suppose that is not intended?

Best regards,
--
Johannes Meyer
Junior Application Developer
NETWAYS GmbH | Deutschherrnstr. 15-19 | D-90429 Nürnberg
Tel: +49 911 92885-0 | Fax: +49 911 92885-77
GF: Julian Hein, Bernd Erk | AG Nürnberg HRB18461
http://www.netways.de | joh...@ne... |
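[Editor's sketch] The effect described above can be reproduced without pyparsing. The classes below are hypothetical stand-ins, not the library's real code: they only mimic the relationship between a base class, an Empty-like class that sets name, and a subclass whose super() call names the wrong class.

```python
class Token(object):
    """Hypothetical stand-in for the common base class."""
    def __init__(self):
        self.initialised = True


class Empty(Token):
    """Stand-in for the class whose constructor sets `name`."""
    def __init__(self):
        super(Empty, self).__init__()
        self.name = "Empty"


class BrokenErrorStop(Empty):
    def __init__(self):
        # Bug: super(Empty, self) starts the method lookup *after* Empty,
        # so Empty.__init__ is skipped and `name` is never set.
        super(Empty, self).__init__()


class FixedErrorStop(Empty):
    def __init__(self):
        # Naming the current class makes the lookup start at Empty.
        super(FixedErrorStop, self).__init__()


broken = BrokenErrorStop()
fixed = FixedErrorStop()
print(hasattr(broken, "name"), fixed.name)
```

The broken variant still runs the base initialiser, which is why the problem only shows up later, when something asks for the missing attribute.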
From: Claus R. <cla...@ro...> - 2012-09-20 10:30:50
|
OK, I changed it to the minimal set I need at the moment:

    text = Word( alphanums + "," + " ")

But it seems I have another problem, with this line:

    NETSET = "NetSet" + LBRACE + ZeroOrMore(Group(v_text | v_num | NETSET_PLIST | NETSET_NLIST)) + RBRACE

It references NETSET_PLIST, which is defined as follows:

    NETSET_PLIST = ( "list" + EQ + LBRACE + ZeroOrMore(NETENTRY) + RBRACE )

This means it should match "list={" and a few child options, but I get the following message while parsing:

    pyparsing.ParseException: Expected "}" (at char 302), (line:8, col:25)

Claus |
From: Eric S.. J. <es...@es...> - 2012-09-18 18:44:59
|
----- Original Message -----
> From: "Paul McGuire" <pt...@au...>
> To: "Eric S.. Johansson" <es...@es...>, pyp...@li...
> Sent: Tuesday, September 18, 2012 7:55:17 AM
> Subject: RE: [Pyparsing] more refinement but still lost
>
> Sorry for my terse reply earlier - hit send too early!

Not a problem. That happens to me with speech recognition errors. Sucks being disabled and then your tools bite you in the ass.

> Eric, you have definitely taken on an ambitious first-project for
> pyparsing. Writing BNF's takes some practice, but it is important to really get
> your thoughts down about how the parser is supposed to work before getting
> mired down in Words, and Groups, Forwards, etc. In your nested terms, let
> the recursion in the BNF take care of nesting []'s - when you have
> LBRACK/RBRACK in two different levels of your nesting, it's a sign you should
> rethink just how you have defined the contents of this group.

Yes, it is an ambitious project. As I've probably said earlier, I am disabled and as a result have become quite smart about user interfaces. Sadly, not being able to write code or webpages gets in the way of proving my intellectual capabilities vis-à-vis getting a job. The usability model behind this project should allow me to show off some of my UI chops.[1]

The syntax is one I inherited from another project, but it turned out to be a great base for a speech-recognition-friendly method of generating webpages. I know that the differentiation between arguments and keywords using the same bracket is problematic. I've been wrestling with that one for quite a while. One model says change the notation between keywords and arguments. Another says change the notation completely so that the problem doesn't come up.

I think the fundamental structure should remain the same, because it works for speech recognition use. It could use a little boost from a smart editor but hey, I will live with what I've got. In any case, I'm open to suggestions for reworking the syntax/notation.

    Mixed text ::= [<plaintext>]* [<open square bracket> <keyword> [<open square bracket><argument> [<mixed text>]<close square bracket>]* [<mixed text>] <close square bracket>] [<plaintext>]*

I think that's the best definition of how it is now. I'm trying to think of a way I can make arguments as functions go away and be replaced by keyword definitions everywhere.

> Here's my earlier post, with annotating comments.

Thank you. I'll take a look later today.

--- eric

[1] https://docs.google.com/document/d/1In11apApKozw_UOPAhVz0ePqns72_6652Dra34xWp4E/edit
http://blog.esjworks.com core ideas behind using speech recognition for programming |
From: Paul M. <pt...@au...> - 2012-09-18 12:22:58
|
In your definition of 'text', you only match a single word - "name={LAN SIDE}" contains a text value with whitespace, more than one word. Also, beware of making 'text' match too much, you may end up including your trailing '}' as part of the text (as you will with your current definition of text as "Word(printables)").

-- Paul

-----Original Message-----
From: Claus Rosenberger [mailto:cla...@ro...]
Sent: Tuesday, September 18, 2012 3:13 AM
To: pyp...@li...
Subject: [Pyparsing] Parsing object structure

Hi,

i try to parse an object structure which is using open and close tags and containing lists of another objects.

Example:

NetSet{
  name={LAN SIDE}
  oid={123998333,723663,2625521122}
  readOnly={0}
  origin={}
  global={0}
  comment={Voice Server UC}
  list={
    NetEntry{
      name={}
      readOnly={0}
      origin={}
      global={0}
      comment={}
      addr={192.169.0.0/24}
    }
  }
  neglist={
  }
}

I tried to start with following code structure and don't know if it makes sense or not.

from pyparsing import *

LBRACE,RBRACE,SEMI,EQ,PCT = map(Suppress,"{};=%")
comment = SEMI + restOfLine

keyName = Word(alphas)
text = Word( alphanums, "-", ",")
text = Word(printables)
num = Word(nums)
ip = Combine(Word(nums) + ('.' + Word(nums))*3)

v_text = keyName + EQ + LBRACE + Optional(text) + RBRACE
v_num = keyName + EQ + LBRACE + Optional(num) + RBRACE
v_ip = keyName + EQ + LBRACE + Optional(ip) + RBRACE

NETENTRY = ( "NetEntry" + LBRACE + ZeroOrMore(Group(v_text | v_num | v_ip)) + RBRACE )
NETSET_PLIST = ( "list" + EQ + LBRACE + ZeroOrMore(NETENTRY) + RBRACE )
NETSET_NLIST = "neglist" + EQ + LBRACE + ZeroOrMore(Group(NETENTRY)) + RBRACE
NETSET = "NetSet" + LBRACE + ZeroOrMore(Group(v_text | v_num | NETSET_PLIST | NETSET_NLIST)) + RBRACE
rule_ref = NETSET

for mr in rule_ref.parseFile("sample"):
    print mr

I get following result:

pyparsing.ParseException: Expected "}" (at char 48), (line:2, col:25)

It seems NETSET_PLIST does not work, perhaps my code is the wrong approach.

Thanks a lot
Claus |
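[Editor's sketch] Following the two hints in the reply above (values may contain whitespace, and the value expression must not swallow the closing brace), one possible shape for the grammar is shown below. It is a rough sketch, not the poster's final solution: the sample is a shortened version of the NetSet example, and the idea of treating every value as either nested objects or plain text up to the next brace is my assumption.

```python
from pyparsing import (CharsNotIn, Forward, Group, OneOrMore, Optional,
                       Suppress, Word, ZeroOrMore, alphas)

sample = """
NetSet{
  name={LAN SIDE}
  readOnly={0}
  origin={}
  list={
    NetEntry{
      name={}
      addr={192.169.0.0/24}
    }
  }
  neglist={
  }
}
"""

LBRACE, RBRACE, EQ = map(Suppress, "{}=")
key = Word(alphas)

# Leaf text: everything up to the next brace, so "LAN SIDE" stays in one
# piece and the closing "}" is never consumed as part of the value.
leaf = CharsNotIn("{}").setParseAction(lambda t: t[0].strip())

obj = Forward()
# A value is either one or more nested objects or (possibly empty) text.
value = OneOrMore(obj) | Optional(leaf, default="")
attribute = Group(key + EQ + LBRACE + value + RBRACE)
obj << Group(key + LBRACE + ZeroOrMore(attribute) + RBRACE)

result = obj.parseString(sample)
print(result.asList())
```

Typed values such as numbers or IP addresses could be layered on top of leaf with further parse actions once the overall structure parses.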
From: Paul M. <pt...@au...> - 2012-09-18 11:55:26
|
Sorry for my terse reply earlier - hit send too early!

Eric, you have definitely taken on an ambitious first-project for pyparsing. Writing BNF's takes some practice, but it is important to really get your thoughts down about how the parser is supposed to work before getting mired down in Words, and Groups, Forwards, etc. In your nested terms, let the recursion in the BNF take care of nesting []'s - when you have LBRACK/RBRACK in two different levels of your nesting, it's a sign you should rethink just how you have defined the contents of this group.

Here's my earlier post, with annotating comments.

-- Paul

all_tests = {
    "test_1": "some plain text",
    "test_2": "[simple ]",
    "test_3": "[simple_text some plain text]",
    "test_4": "[onearg [one ]]",
    "test_5": "[twoarg [one ] [two ]]",
    "test_6": "[onearg_text [one some plain text]]",
    "test_7": "[twoarg_text [one ] [two some plain text arg]]",
    "test_8": "[nested_text some [not plain] text]",
    "test_9": "[nested_text [one text] some [not [very ] plain] text]",
    "test_10": "[nested_text_escaped [one text] some [not [very ] plain] bracketed \[text\]]",
    "test_11": """[nested_text_escaped_indented [one text] some [not [very ] plain ] bracked \[text\] ]""",
}

# a simple BNF:
#
# listExpr ::= '[' listContent ']'
# listContent ::= (contentsWord | escapedChar | listExpr)*
# contentsWord ::= printableCharacter+
#
#
# Some notes:
# 1. listContent could be empty, "[]" is a valid listExpr
# 2. contentsWord cannot contain '\', '[' or ']' characters, or
#    else we couldn't distinguish delimiters from contents, or
#    detect escapes
#

from pyparsing import *

# start with the basics
LBRACK,RBRACK = map(Suppress,"[]")
escapedChar = Combine('\\' + oneOf(list(printables)))
contentsWord = Word(printables,excludeChars=r"\[]")

# define a placeholder for a nested list, since we need to
# reference it before it is fully defined
listExpr = Forward()

# the contents of a list is zero or more contents words, escaped
# characters, or lists
listContent = ZeroOrMore(contentsWord | escapedChar | listExpr)

# a list is a listContent enclosed in []'s - enclose
# in a Group so that pyparsing will maintain the nested structure
#
# since listExpr was already defined as a Forward, we use '<<' to
# "inject" the definition into the already defined Forward
listExpr << Group(LBRACK + listContent + RBRACK)

# parse the test string - note that the results no longer contain
# the parsed '[' and ']' characters, but they do retain the
# nesting of the original string in nested lists
for name,testStr in all_tests.items():
    print name, listContent.parseString(testStr).asList()

prints:

test_11 [['nested_text_escaped_indented', ['one', 'text'], 'some', ['not', ['very'], 'plain'], 'bracked', '\\[', 'text', '\\]']]
test_10 [['nested_text_escaped', ['one', 'text'], 'some', ['not', ['very'], 'plain'], 'bracketed', '\\[', 'text', '\\]']]
test_7 [['twoarg_text', ['one'], ['two', 'some', 'plain', 'text', 'arg']]]
test_6 [['onearg_text', ['one', 'some', 'plain', 'text']]]
test_5 [['twoarg', ['one'], ['two']]]
test_4 [['onearg', ['one']]]
test_3 [['simple_text', 'some', 'plain', 'text']]
test_2 [['simple']]
test_1 ['some', 'plain', 'text']
test_9 [['nested_text', ['one', 'text'], 'some', ['not', ['very'], 'plain'], 'text']]
test_8 [['nested_text', 'some', ['not', 'plain'], 'text']]

# pyparsing includes a short-cut to simplify defining nested
# structures like this
print nestedExpr('[',']').parseString(all_tests['test_9']).asList()
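[Editor's sketch] For readers on Python 3, the nestedExpr shortcut mentioned at the end of the post can be tried as follows. This is only a small illustration using the test_9 string from the post; it deliberately avoids the escaped-bracket cases, which the hand-written grammar above handles through escapedChar.

```python
from pyparsing import nestedExpr

sample = "[nested_text [one text] some [not [very ] plain] text]"

# nestedExpr builds the recursive grammar for the given delimiters and
# returns the contents as nested lists, without the delimiters themselves.
parsed = nestedExpr("[", "]").parseString(sample).asList()
print(parsed)
```

The nesting of the result mirrors the bracket nesting of the input, matching the test_9 line in the output listed above.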
|