pyparsing-users Mailing List for Python parsing module (Page 2)
Brought to you by: ptmcg
From: <pt...@au...> - 2017-10-20 12:41:04
|
I looked at this briefly last week; I thought I had a working example of this in the parsePythonValue.py example, but it seems to have the same problem.
I thought I had worked out a working version at one time using ungroup, but have not succeeded with that.
Here is an example of an expression and parse action that creates a list. Bear in mind that the return value from parseString is *always* a ParseResults, even if it is just `pp.pyparsing_common.integer.parseString("123")`.
Without the parse action, the parsed list is a ParseResults returned as the 0'th element of a ParseResults:
import pyparsing as pp
LBRACK, RBRACK = map(pp.Suppress, "[]")
item = pp.Word(pp.alphas)
list_expr = LBRACK + pp.delimitedList(item) + RBRACK
ret = pp.Group(list_expr).parseString("[a,b,c,d]")
print(ret)
print(type(ret))
print(ret[0])
print(type(ret[0]))
Gives:
[['a', 'b', 'c', 'd']]
<class 'pyparsing.ParseResults'>
['a', 'b', 'c', 'd']
<class 'pyparsing.ParseResults'>
Adding this parse action, we get a list returned as the 0'th element of a ParseResults. Note *no* Group around the list expression:
def make_list(tokens):
    contents = tokens.asList()
    tokens[:] = (contents,)

list_expr.addParseAction(make_list)
ret = list_expr.parseString("[a,b,c,d]")
print(ret)
print(type(ret))
print(ret[0])
print(type(ret[0]))
Gives:
[['a', 'b', 'c', 'd']]
<class 'pyparsing.ParseResults'>
['a', 'b', 'c', 'd']
<class 'list'>
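For comparison, the same effect can be had with a one-line parse action. This is an editor's sketch along the same lines, not from the original message:

```python
import pyparsing as pp

LBRACK, RBRACK = map(pp.Suppress, "[]")
item = pp.Word(pp.alphas)
list_expr = LBRACK + pp.delimitedList(item) + RBRACK

# Returning a new token list from a parse action replaces the tokens;
# wrapping the plain list in another list keeps it as a single element.
list_expr.addParseAction(lambda t: [t.asList()])

ret = list_expr.parseString("[a,b,c,d]")
print(type(ret[0]), ret[0])  # <class 'list'> ['a', 'b', 'c', 'd']
```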
Does that get you closer?
-- Paul
---- Athanasios Anastasiou <ath...@gm...> wrote:
> Hello everyone
>
> Any ideas on the attached?
>
> All the best
> AA
>
>
> ---------- Forwarded message ----------
> From: Athanasios Anastasiou <ath...@gm...>
> Date: Wed, Oct 4, 2017 at 11:44 AM
> Subject: Returning an actual `list`
> To: pyp...@li...
>
>
> Hello
>
> I have set up a very simple "primitive data type" parsing system to parse
> numbers, quoted strings, lists and dictionaries. The last two are following
> Python's convention.
>
> While this is working, I am having trouble returning specific data types
> from the "action" of the LIST definition.
>
> Here is a snippet:
>
> LIST = pyparsing.Forward()
> DICT = pyparsing.Forward()
>
> VALUE = (NUMBER|IDENTIFIER|DICT|pyparsing.Group(LIST))
>
> KEY_VALUE_PAIR = pyparsing.Group(IDENTIFIER("key") +
> pyparsing.Suppress(":") + VALUE("value"))
>
> LIST << pyparsing.Suppress("[") + pyparsing.delimitedList(VALUE) +
> pyparsing.Suppress("]")
> DICT << pyparsing.Suppress("{") + pyparsing.delimitedList(KEY_VALUE_PAIR) +
> pyparsing.Suppress("}")
>
> So, when you are trying to parse something like: [1,2,3,4,[5,6]], this is
> returned as a pyparsing.ParseResults type of object, rather than a `list`.
> I have tried to set a simple parseAction with `lambda s,l,t: list(t)` or
> `lambda s,l,t:list(t[0])` but I am still getting back a ParseResults object.
>
> Ideally, I would like the rule to return a list, just like the INT rule
> returns a proper Python integer.
>
> Any ideas about what I am missing?
>
> All the best
> AA
> _______________________________________________
> Pyparsing-users mailing list
> Pyp...@li...
> https://lists.sourceforge.net/lists/listinfo/pyparsing-users
|
|
From: Athanasios A. <ath...@gm...> - 2017-10-20 10:08:03
|
Hello everyone
Any ideas on the attached?
All the best
AA
---------- Forwarded message ----------
From: Athanasios Anastasiou <ath...@gm...>
Date: Wed, Oct 4, 2017 at 11:44 AM
Subject: Returning an actual `list`
To: pyp...@li...
Hello
I have set up a very simple "primitive data type" parsing system to parse
numbers, quoted strings, lists and dictionaries. The last two are following
Python's convention.
While this is working, I am having trouble returning specific data types
from the "action" of the LIST definition.
Here is a snippet:
LIST = pyparsing.Forward()
DICT = pyparsing.Forward()
VALUE = (NUMBER|IDENTIFIER|DICT|pyparsing.Group(LIST))
KEY_VALUE_PAIR = pyparsing.Group(IDENTIFIER("key") +
pyparsing.Suppress(":") + VALUE("value"))
LIST << pyparsing.Suppress("[") + pyparsing.delimitedList(VALUE) +
pyparsing.Suppress("]")
DICT << pyparsing.Suppress("{") + pyparsing.delimitedList(KEY_VALUE_PAIR) +
pyparsing.Suppress("}")
So, when you are trying to parse something like: [1,2,3,4,[5,6]], this is
returned as a pyparsing.ParseResults type of object, rather than a `list`.
I have tried to set a simple parseAction with `lambda s,l,t: list(t)` or
`lambda s,l,t:list(t[0])` but I am still getting back a ParseResults object.
Ideally, I would like the rule to return a list, just like the INT rule
returns a proper Python integer.
Any ideas about what I am missing?
All the best
AA
|
|
From: Athanasios A. <ath...@gm...> - 2017-10-04 10:44:35
|
Hello
I have set up a very simple "primitive data type" parsing system to parse
numbers, quoted strings, lists and dictionaries. The last two are following
Python's convention.
While this is working, I am having trouble returning specific data types
from the "action" of the LIST definition.
Here is a snippet:
LIST = pyparsing.Forward()
DICT = pyparsing.Forward()
VALUE = (NUMBER|IDENTIFIER|DICT|pyparsing.Group(LIST))
KEY_VALUE_PAIR = pyparsing.Group(IDENTIFIER("key") +
pyparsing.Suppress(":") + VALUE("value"))
LIST << pyparsing.Suppress("[") + pyparsing.delimitedList(VALUE) +
pyparsing.Suppress("]")
DICT << pyparsing.Suppress("{") + pyparsing.delimitedList(KEY_VALUE_PAIR) +
pyparsing.Suppress("}")
So, when you are trying to parse something like: [1,2,3,4,[5,6]], this is
returned as a pyparsing.ParseResults type of object, rather than a `list`.
I have tried to set a simple parseAction with `lambda s,l,t: list(t)` or
`lambda s,l,t:list(t[0])` but I am still getting back a ParseResults object.
Ideally, I would like the rule to return a list, just like the INT rule
returns a proper Python integer.
Any ideas about what I am missing?
All the best
AA
|
|
From: Evan H. <eva...@gm...> - 2017-07-18 07:14:57
|
I rewrote PyParsing in Cython as part of my efforts to speed up Coconut <http://coconut-lang.org/> and Undebt <https://github.com/Yelp/undebt>. You can check it out on GitHub here <https://github.com/evhub/cpyparsing>, or download it from PyPI with `pip install cpyparsing`. In my testing with Coconut, I've found it to be about 30% faster, so if you're looking for maximum PyParsing performance, you might want to check it out. Right now, it uses very few Cython features, so there's still a lot of room for improvement. If anyone wants to try strategically moving functions over to use `cdef` or `cpdef` and submit a PR, that would be much appreciated! Cheers, Evan Hubinger -- “The true logic of this world is in the calculus of probabilities.” – James Clerk Maxwell |
|
From: Eric D. <eri...@gm...> - 2017-07-11 12:07:17
|
That looks like it should work! I tried something like that earlier but
couldn't quite get it working right; I think it had to do with
namespace issues more than anything.
On 11 Jul 2017 12:43 am, "Evan Hubinger" <eva...@gm...> wrote:
> How about this:
>
> from pyparsing import ParserElement, Literal
> ParserElement.setDefaultWhitespaceChars(" \t\r\f\v")
> newline = Literal("\n")
>
> On Mon, Jul 10, 2017 at 2:40 PM, Eric Dilmore <eri...@gm...>
> wrote:
>
>> What is the best way to create a newline-sensitive language (like Python
>> itself)? I continually run into problems where the newlines are eaten by
>> the automated whitespace-grabber, and I've found what I believe to be two
>> working solutions:
>>
>> - Attach .setWhitespaceChars(' \t') to the end of all of the low-level
>> parsers so that, when combined, they don't gobble up newlines.
>> - Add White(' \t') between all low-level parsers, such that they eat
>> non-newline whitespace but not the newlines.
>>
>> Neither one of these is particularly elegant. It would be nice if
>> .setWhitespaceChars propagated down to children (like .ignore does), but I
>> don't expect it to be changed anytime soon because it would break backwards
>> compatibility pretty bad for anyone who was using either one of these
>> methods of producing newline-dependent languages.
>>
>> Which of these two is preferable? Or, alternatively, is there an even
>> better way that I haven't thought of yet?
>>
>> Thanks in advance!
>>
>
>
>
> --
> “The true logic of this world is in the calculus of probabilities.” –
> James Clerk Maxwell
>
|
|
From: Evan H. <eva...@gm...> - 2017-07-11 05:43:07
|
How about this:
from pyparsing import ParserElement, Literal
ParserElement.setDefaultWhitespaceChars(" \t\r\f\v")
newline = Literal("\n")
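Fleshing that out, a minimal runnable sketch (the toy grammar below is invented for illustration; the call must run before any expressions are built, since each element captures the default whitespace characters at construction time):

```python
import pyparsing as pp

# Must run BEFORE any expressions are constructed: each element copies
# the default whitespace characters at creation time.
pp.ParserElement.setDefaultWhitespaceChars(" \t")

NL = pp.Suppress(pp.LineEnd())
word = pp.Word(pp.alphas)
line = pp.Group(pp.OneOrMore(word)) + NL
prog = pp.OneOrMore(line)

result = prog.parseString("one two\nthree four\n", parseAll=True)
print(result)  # [['one', 'two'], ['three', 'four']]
```

Without the setDefaultWhitespaceChars call, the words of both lines would be collected into one flat run because the newline would be skipped as ordinary whitespace.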
On Mon, Jul 10, 2017 at 2:40 PM, Eric Dilmore <eri...@gm...> wrote:
> What is the best way to create a newline-sensitive language (like Python
> itself)? I continually run into problems where the newlines are eaten by
> the automated whitespace-grabber, and I've found what I believe to be two
> working solutions:
>
> - Attach .setWhitespaceChars(' \t') to the end of all of the low-level
> parsers so that, when combined, they don't gobble up newlines.
> - Add White(' \t') between all low-level parsers, such that they eat
> non-newline whitespace but not the newlines.
>
> Neither one of these is particularly elegant. It would be nice if
> .setWhitespaceChars propagated down to children (like .ignore does), but I
> don't expect it to be changed anytime soon because it would break backwards
> compatibility pretty bad for anyone who was using either one of these
> methods of producing newline-dependent languages.
>
> Which of these two is preferable? Or, alternatively, is there an even
> better way that I haven't thought of yet?
>
> Thanks in advance!
>
--
“The true logic of this world is in the calculus of probabilities.” – James
Clerk Maxwell
|
|
From: Eric D. <eri...@gm...> - 2017-07-10 21:40:13
|
What is the best way to create a newline-sensitive language (like Python
itself)? I continually run into problems where the newlines are eaten by
the automated whitespace-grabber, and I've found what I believe to be
two working solutions:
- Attach .setWhitespaceChars(' \t') to the end of all of the low-level
parsers so that, when combined, they don't gobble up newlines.
- Add White(' \t') between all low-level parsers, such that they eat
non-newline whitespace but not the newlines.
Neither one of these is particularly elegant. It would be nice if
.setWhitespaceChars propagated down to children (like .ignore does), but
I don't expect it to be changed anytime soon because it would break
backwards compatibility pretty bad for anyone who was using either one
of these methods of producing newline-dependent languages.
Which of these two is preferable? Or, alternatively, is there an even
better way that I haven't thought of yet?
Thanks in advance!
|
|
From: Ralph C. <ra...@in...> - 2017-02-26 12:11:05
|
Hi Malahal,
> export {
> path = path1;
> key2 = value2;
> client {
> clientid = value3;
> key4 = value4;
> }
> client {
> clientid = value6;
> }
> }
...
> Given an export path, I would like to fetch the corresponding export
> block.
This sounds more like a Python programming problem than Pyparsing. You
want to build a `dict' indexed by an export's `path' as you're parsing.
https://docs.python.org/3/tutorial/datastructures.html#dictionaries
The value for a key would be a data of your design that describes an
export. Lookup would then be
exports['path1']
> Similarly, for a given path and clientid, I would like to get the
> corresponding "client" block quickly
Same here, except you have a dict of dicts.
clients['path1']['value3']
Or you might make the information available as part of your exports dict
if the value was an object with a `clients' attribute that was a dict.
exports['path1'].clients['value3']
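As a plain-Python sketch of that lookup structure (the parsed data below is hand-written to stand in for parser output; names are invented):

```python
# Sketch: build the path-indexed lookup described above.
# parsed_exports stands in for what a parser would produce.
parsed_exports = [
    {"path": "path1", "key2": "value2",
     "clients": [{"clientid": "value3", "key4": "value4"},
                 {"clientid": "value6"}]},
]

exports = {}
for exp in parsed_exports:
    # index each export's clients by clientid for O(1) lookup
    exp["clients"] = {c["clientid"]: c for c in exp["clients"]}
    exports[exp["path"]] = exp

print(exports["path1"]["clients"]["value3"]["key4"])  # value4
```

With thousands of client sub-blocks, building these dicts once during parsing makes each later lookup constant-time instead of a scan.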
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
|
|
From: Malahal N. <ma...@gm...> - 2017-02-26 05:04:10
|
I am trying to implement a config editor for a project. The config
syntax is very simple. It consists of "key value" pairs in blocks.
Blocks may have sub-blocks (only one level at this time).
An example:
export {
path = path1;
key2 = value2;
client {
clientid = value3;
key4 = value4;
}
client {
clientid = value6;
}
}
The "export" is uniquely identified by a specific "path" value. Same
is the case with "client" block inside the "export" block which is
identified by the "clientid" value.
Given an export path, I would like to fetch the corresponding export
block. Of course, I can easily match all exports and search each such
block for a matching "path". Similarly, for a given path and clientid,
I would like to get the corresponding "client" block quickly as there
may be 1000's of these sub-blocks in the export block.
Regards, Malahal.
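For readers who want a starting point, here is a hedged sketch of a pyparsing grammar for this config syntax (everything below is invented for illustration and was not part of the original thread):

```python
import pyparsing as pp

LBRACE, RBRACE, SEMI, EQ = map(pp.Suppress, "{};=")
name = pp.Word(pp.alphas, pp.alphanums + "_")
value = pp.Word(pp.alphanums + "_/.")

# a "key = value;" statement
kv = pp.Group(name + EQ + value + SEMI)
# a block: name { statements... }, where a statement is a kv pair or a sub-block
block = pp.Forward()
block <<= pp.Group(name("type") + LBRACE +
                   pp.Group(pp.ZeroOrMore(kv | block))("body") + RBRACE)

sample = """
export {
    path = path1;
    client {
        clientid = value3;
        key4 = value4;
    }
}
"""
exp = block.parseString(sample, parseAll=True)[0]
print(exp["type"])              # export
print(exp["body"][0].asList())  # ['path', 'path1']
print(exp["body"][1]["type"])   # client
```

From this structure one can then build the path- and clientid-indexed dictionaries for fast lookup.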
|
|
From: E Y. <you...@gm...> - 2017-01-15 09:41:22
|
Hi all, it took a while, but I managed to complete an initial working version of the PLY parser thanks to Paul's suggestions. The self-contained example is here: https://gist.github.com/youngec/76c5b552ab5bc04b43d1bfa9d3fa0a78. Its only dependency is the Python package attrs (https://pypi.org/pypi/attrs), because it makes Python classes a bit easier to manage. You're free to use that parser as you wish. The parser is part of a larger project at https://github.com/youngec/rootspace/tree/develop, and the respective unit tests are at https://github.com/youngec/rootspace/blob/develop/src/rootspace/tests/test_parsers.py. There's one big caveat to the parser, though, and I'm hoping you can provide some guidance: currently it only understands ASCII-based PLY files, but not the two much more common binary formats (little- and big-endian). Can pyparsing handle binary data? Thanks so much! Best, Ellie
On 24 December 2016 at 06:48, <pt...@au...> wrote:
> > first off, I would like to thank the developer(s) for making such a great
> > tool. It's such an elegant piece of work!
>
> Thank you - flattery is always a good start! I'm glad pyparsing has been
> helpful for you, and that you find it to your liking.
>
> Kudos on your posted parser - starting with posting your BNF! So many
> people skip this step, and then get mired down in distracting details, and
> find they have overlooked or skipped significant parts. I'm also glad to
> see that you are using the new pyparsing_common expressions. I had some
> mixed feelings about including them, and your sample code shows them being
> put to good use.
>
> > Though, I'm happy if somebody can provide some help. I'm trying to
> > implement a Stanford polygon file parser
> > (http://paulbourke.net/dataformats/ply/), but I'm having trouble getting
> > it to parse the file into the correct data structure, because parts of
> > the file depend on declarations earlier in the same file.
>
> This happens often enough, and there are several examples of such adaptive
> parsers that I have posted over the years. The main concept is to define a
> Forward for the part of the parser that needs to be adaptive, and then to
> use a parse action attached to the part with the format description to
> create the flexible part of the parser, and to then insert it into the
> placeholder Forward using the '<<' operator.
>
> I touched up your parser from the gist that you posted, and have added
> some sample code that will parse the test cube example. You can find it
> here: https://gist.github.com/anonymous/4498de5ce91da8cf4292bad408076b97
> I think you'll be able to take it from there. (I also applied some
> stylistic changes - removing explicit .setResultsName calls in favor of
> implicit callable notation, changing CaselessLiteral to CaselessKeyword, a
> Group added here or there for better structure and results name assignment.
> Feel free to keep any, all, or none of these changes.)
>
> Please post a link to your finished project to this list, or the Pyparsing
> Facebook page.
>
> Regards,
> -- Paul McGuire
|
|
From: <pt...@au...> - 2017-01-02 13:03:12
|
Interesting idea, but it *will* spew just a ton of logging! In fact, it will be mostly gibberish unless you have used setName() throughout your code to give meaningful labels to your grammar expressions. But I'll add it to my "to-do" list for the next release. -- Paul |
|
From: Mike S. <mik...@co...> - 2016-12-31 21:03:07
|
Sometimes it is annoying to try and figure out which elements need a setDebug() to figure out what went wrong. It would be nice to do: ParserElement.setDefaultDebug(True) ... ParserElement.setDefaultDebug(False) just like you can do: ParserElement.setDefaultWhitespaceChars() Then, I could just put this at the top before parser construction, and there isn't anything I would miss (or, at least, nothing in my source). Pre-built things would still not have debug set, like lineEnd for example; but that's ok. This more closely matches the feature of yacc/bison debugging, which will spray a ton of detail. Handy when just turn on all debugging, find and fix the issue, then turn it off. Easy to do with a one line change in a larger grammar. |
|
From: Mike S. <mik...@co...> - 2016-12-31 20:58:01
|
The web site at: http://pyparsing.wikispaces.com/GettingHelp says: You can e-mail your message to pyparsing(a)lists.sourceforge.net but this is wrong. It should likely be: You can e-mail your message to pyparsing-users(a)lists.sourceforge.net |
|
From: <pt...@au...> - 2016-12-24 05:48:49
|
> first off, I would like to thank the developer(s) for making such a great
> tool. It's such an elegant piece of work!

Thank you - flattery is always a good start! I'm glad pyparsing has been helpful for you, and that you find it to your liking.

Kudos on your posted parser - starting with posting your BNF! So many people skip this step, and then get mired down in distracting details, and find they have overlooked or skipped significant parts. I'm also glad to see that you are using the new pyparsing_common expressions. I had some mixed feelings about including them, and your sample code shows them being put to good use.

> Though, I'm happy if somebody can provide some help. I'm trying to
> implement a Stanford polygon file parser
> (http://paulbourke.net/dataformats/ply/), but I'm having trouble getting it
> to parse the file into the correct data structure, because parts of the
> file depend on declarations earlier in the same file.

This happens often enough, and there are several examples of such adaptive parsers that I have posted over the years. The main concept is to define a Forward for the part of the parser that needs to be adaptive, and then to use a parse action attached to the part with the format description to create the flexible part of the parser, and to then insert it into the placeholder Forward using the '<<' operator.

I touched up your parser from the gist that you posted, and have added some sample code that will parse the test cube example. You can find it here: https://gist.github.com/anonymous/4498de5ce91da8cf4292bad408076b97 I think you'll be able to take it from there. (I also applied some stylistic changes - removing explicit .setResultsName calls in favor of implicit callable notation, changing CaselessLiteral to CaselessKeyword, a Group added here or there for better structure and results name assignment. Feel free to keep any, all, or none of these changes.)

Please post a link to your finished project to this list, or the Pyparsing Facebook page.

Regards,
-- Paul McGuire |
|
From: E Y. <you...@gm...> - 2016-12-23 09:24:39
|
Dear All, first off, I would like to thank the developer(s) for making such a great tool. It's such an elegant piece of work! Though, I'm happy if somebody can provide some help. I'm trying to implement a Stanford polygon file parser (http://paulbourke.net/dataformats/ply/), but I'm having trouble getting it to parse the file into the correct data structure, because parts of the file depend on declarations earlier in the same file. For those not in the know, the PLY file format has a header and a body, as seen in the example posted on the aforementioned link. The header declares how the body is supposed to be parsed and the body consists only of numbers (either textual or binary). The declarations dictate the order of the data in the body. My attempt at a context-free grammar is as follows:
ply_grammar   ::= header body
header        ::= "ply" declaration+ "end_header"
declaration   ::= format | element | property
format        ::= "format" format_type NUMBER
element       ::= "element" element_type NUMBER
property      ::= ("property" property_type IDENT) | ("property" "list" property_type property_type IDENT)
format_type   ::= "vertex" | "face" | "edge" | IDENT
property_type ::= "char" | "uchar" | "short" | "ushort" | "int" | "uint" | "float" | "double"
body          ::= statement+
statement     ::= NUMBER+
The code is available at: https://gist.github.com/anonymous/f7fee82634ba224e25e9022ec2c3c890 So far, I managed to parse the file header such that each declared element has nested property declarations, but I'm unable to tell pyparsing how to parse the body data accordingly. In the end, I would like to have instances of numpy.ndarray or array.array for each element declared in the header. Any help is greatly appreciated! Best, E |
|
From: simplevolk <sim...@gm...> - 2016-10-27 13:04:31
|
Greetings! I am looking for a tool or library which can parse files matching a specific pattern: http://pastebin.com/3sDD1wyW. I have a lot of files with this pattern (and some others). So, can Pyparsing parse this? And how difficult would it be? I need to parse all the data, except data in the 'г===============================' row. Thank you! |
|
From: Shane M. <sha...@gm...> - 2016-07-09 09:45:35
|
I'm developing a parser for the Graphviz DOT language and am having
problems with my STMT expression in the grammar fragment below.
In this simplified grammar a STMT can be either a SUBGRAPH or a NODE_STMT.
An example SUBGRAPH expression is "subgraph cluster01 { n003 ; n004 ; }"
which is as you can see a composite statement.
My problem is that whilst the SUBGRAPH expression will happily accept the
test example, the STMT expression will not though it is defined as below:
STMT = SUBGRAPH("SUBGRAPH") ^ NODE_STMT("NODE")
and the test code runs:
Testing subgraph statements
Match SUBGRAPH at loc 0(1,1)
Match STMT_LIST at loc 20(1,21)
Matched STMT_LIST -> ['n003', 'n004']
Matched SUBGRAPH -> [['subgraph', 'cluster01'], ['n003', 'n004']]
([(['subgraph', 'cluster01'], {'SUBGRAPHNAME': [('cluster01', 1)]}),
(['n003', 'n004'], {'NODE': [('n003', 0), ('n004', 1)], 'STMT': [('n003',
0), ('n004', 1)]})], {})
Match STMT at loc 0(1,1)
Matched STMT -> ['subgraph']
Problem Test Sample: LINE= 1 COL= 10
subgraph cluster01 { n003 ; n004 ; }
ERROR: Expected end of text (at char 9), (line:1, col:10)
My "belief" is that the STMT expression should preferentially match the
SUBGRAPH expression rather than the NODE_STMT expression but clearly is not.
What am I missing?
BTW - StackOverflow points available :
http://stackoverflow.com/questions/38258218/suspected-pyparsing-longest-match-error
Thanks :-)
Grammar below:
LCURL = Literal("{").suppress()
RCURL = Literal("}").suppress()
STMTSEP = Literal(";").suppress()
ID = Word(alphas, alphanums + "_")
SUBGRAPH_KW = Keyword("subgraph", caseless=True)
SUBGRAPH = Forward("SUBGRAPH")
NODE_ID = ID("NODE_ID")
NODE_STMT = NODE_ID("NODE")
STMT = SUBGRAPH("SUBGRAPH") ^ NODE_STMT("NODE")
STMT_LIST = ZeroOrMore(STMT("STMT") + Optional(STMTSEP))
SUBGRAPH << Group(SUBGRAPH_KW + ID("SUBGRAPHNAME")) + Group(LCURL +
STMT_LIST + RCURL)
######################################################
SUBGRAPH.setName("SUBGRAPH")
STMT.setName("STMT")
STMT_LIST.setName("STMT_LIST")
NODE_STMT.setName("NODE_STMT")
ID.setName("ID")
######################################################
print("Testing subgraph statements")
test_ids = [
'''subgraph cluster01 { n003 ; n004 ; }'''
]
################
FRAG_1 = STMT + StringEnd()
################
NODE_STMT.setDebug(True)
SUBGRAPH.setDebug(True)
ID.setDebug(True)
STMT.setDebug(True)
STMT_LIST.setDebug(True)
for test in test_ids:
try:
result = FRAG_1.parseString(test)
pprint.pprint(result)
except ParseException, e:
print("Problem Test Sample: LINE= %s COL= %s" % (e.lineno, e.col))
print (e.line)
print (" " * (e.column - 1) + "^")
print("ERROR: %s" % str(e))
|
|
From: Paul M. <pt...@au...> - 2016-07-05 19:37:14
|
Andrea -
While the conventional BNF approach is usually to use left-recursion for these
types of expressions, pyparsing does not do well with left-recursion.
Instead, you are better off using ZeroOrMore/OneOrMore for the repetition:
indexed_reference = identifier + ZeroOrMore('[' + integer_expr + ']')
This actually can be more complicated, since the integer_expr could itself
be an indexed_reference, if one array contains indexes into the other:
array1[array2[0]]
You may also want to account for slice notation (assuming this is Python
that you are parsing):
slice_expr = '[' + integer_expr + Optional(':' + integer_expr +
Optional(':' + integer_expr)) + ']'
indexed_reference = identifier + ZeroOrMore(slice_expr)
This has been a busy week, so I've not had more chance to delve into this in
detail. I know that I've written a pyparsing generator (in pyparsing) that
reads the Python BNF and creates a pyparsing parser for it. I'll have to
dust this off and see how this is done in that code (you can also look for
yourself, I'm pretty sure it is in the shipped/online examples).
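Putting those pieces together, here is a runnable sketch (an editor's example, with names assumed rather than taken from Andrea's grammar):

```python
import pyparsing as pp

identifier = pp.Word(pp.alphas, pp.alphanums + "_")
LBRACK, RBRACK = map(pp.Suppress, "[]")

integer_expr = pp.Forward()
# each "[...]" suffix becomes its own group
index = pp.Group(LBRACK + integer_expr + RBRACK)
# identifier followed by zero or more index suffixes: no left-recursion
indexed_reference = pp.Group(identifier + pp.ZeroOrMore(index))
# an index can be a plain integer or another indexed reference
integer_expr <<= pp.pyparsing_common.integer | indexed_reference

print(indexed_reference.parseString("array[0][1][2]", parseAll=True).asList())
print(indexed_reference.parseString("array1[array2[0]]", parseAll=True).asList())
```

The repetition replaces the left-recursive rule, so no infinite-recursion error occurs, and nested references still work because integer_expr refers back through the Forward.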
-- Paul
-----Original Message-----
From: Andrea Censi [mailto:ce...@mi...]
Sent: Saturday, July 02, 2016 6:17 PM
To: pyp...@li...
Subject: [Pyparsing] parsing grammars with suffixes (array indexing,
attributes)
Hi,
I have been using PyParsing for some major projects so far, but I still
don't understand how to properly parse some specific types of recursive
grammars, for which the function operatorPrecedence does not help.
In particular, take Python's syntax for indexing using square brackets.
These are some examples:
array[0]
array[0][1]
array[0][1][2]
I understand why the following solution does not work:
value = Forward()
value_ref = Word(alphas)
value_index = value + "[" + index + "]"
value_ ... = ...
....
value << (value_ref ^ value_index ^ value_... ^ ...)
(This is a very simplified version of what I am dealing with, in which I
have about a dozen possible expressions for value).
I understand why something like the above gives an infinite recursion
exception. The problem is clear. Yet I don't see the solution.
Another example that appears also in python is dealing with attributes:
object
object.attribute
object.attribute.attribute2
(expression ...).attribute
I suspect that this is a general problem that people have.
Any hints?
thanks,
A.
|
|
From: Andrea C. <ce...@mi...> - 2016-07-02 23:17:29
|
Hi, I have been using PyParsing for some major projects so far, but I still don't understand how to properly parse some specific types of recursive grammars, for which the function operatorPrecedence does not help. In particular, take Python's syntax for indexing using square brackets. These are some examples:
array[0]
array[0][1]
array[0][1][2]
I understand why the following solution does not work:
value = Forward()
value_ref = Word(alphas)
value_index = value + "[" + index + "]"
value_ ... = ...
....
value << (value_ref ^ value_index ^ value_... ^ ...)
(This is a very simplified version of what I am dealing with, in which I have about a dozen possible expressions for value.) I understand why something like the above gives an infinite recursion exception. The problem is clear. Yet I don't see the solution. Another example that appears also in Python is dealing with attributes:
object
object.attribute
object.attribute.attribute2
(expression ...).attribute
I suspect that this is a general problem that people have. Any hints? thanks, A. |
From: Elizabeth M. <eli...@in...> - 2016-02-23 11:40:35
On 20/02/16 10:36, Paul McGuire wrote:
> Here is the whole parser in one copy/pasteable chunk:
>
> command = Word(alphas) | Word(nums)
> COLON = Suppress(':')
> middle = ~COLON + Word(printables)
> trailing = Word(printables)
> params = (OneOrMore(middle)("middle") +
> COLON +
> ZeroOrMore(trailing)("trailing"))
> line = (command("command") + Group(params)("params"))
>
> tests = """\
> COMMAND param1 param2 : param3"""
> line.runTests(tests)
>
> And no need to kludge in any `listAllMatches` behavior either.
>
> -- Paul
>
>
There was one minor thing I forgot: Word(printables) is insufficient in
your example, as any 8-bit string is acceptable in parameters (and such
strings are often used).
The encoding is not specified for these portions (deliberately, it
seems), but I usually assume UTF-8, since that is the de facto
standard. As such, UTF-8 is what I decode everything from the wire to,
using the 'replace' error handler.
This is my proposed solution to match all characters, but surely there
is a better way than the below (perhaps using Regex?):
# chr() accepts every code point from 0 to 0x10FFFF (1,114,112 in all).
utf8_chars = ''.join(chr(x) for x in range(0x110000))
middle = ~COLON + ~White() + Word(utf8_chars)
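The Regex route she mentions could look something like this (a sketch of my own, not from the thread; the character classes follow the RFC 1459 definitions of middle and trailing):

```python
from pyparsing import Group, OneOrMore, Regex, Suppress, Word, alphas, nums

COLON = Suppress(':')
command = Word(alphas) | Word(nums, exact=3)

# middle: first octet may not be ':'; no space, NUL, CR, or LF anywhere
middle = Regex(r'[^ :\r\n\x00][^ \r\n\x00]*')
# trailing: everything after the ':', spaces and colons allowed
trailing = Regex(r'[^\r\n\x00]*')

params = Group(OneOrMore(middle)("middle") + COLON + trailing("trailing"))("params")
line = command("command") + params
```

Because each parameter is a single regex match, `trailing` keeps its embedded spaces instead of being split into words.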
--
Elizabeth
From: Paul M. <pt...@au...> - 2016-02-20 16:36:29
Elizabeth -
Googling for RFC 1459, I found this BNF, which looks like what you are
working from:
https://tools.ietf.org/html/rfc1459#section-2.3.1
<message> ::= [':' <prefix> <SPACE> ] <command> <params> <crlf>
<prefix> ::= <servername> | <nick> [ '!' <user> ] [ '@' <host> ]
<command> ::= <letter> { <letter> } | <number> <number> <number>
<SPACE> ::= ' ' { ' ' }
<params> ::= <SPACE> [ ':' <trailing> | <middle> <params> ]
<middle> ::= <Any *non-empty* sequence of octets not including SPACE
or NUL or CR or LF, the first of which may not be ':'>
<trailing> ::= <Any, possibly *empty*, sequence of octets not including
NUL or CR or LF>
<crlf> ::= CR LF
From this BNF, I came up with this translation to pyparsing, very similar
to yours:
COLON = Suppress(':')
command = Word(alphas) | Word(nums, exact=3)
middle = ~COLON + Word(printables)
trailing = Word(printables)
params = Forward()
params <<= COLON + trailing | middle + params
I usually leave the assignment of results names until the very end, just
assigning them in the expressions where they get composed into groups or the
top-most parse expression.
line = command("command") + Group(params)("params")
tests = """\
COMMAND param1 param2 : param3"""
line.runTests(tests)
And this gives:
COMMAND param1 param2 : param3
['COMMAND', ['param1', 'param2', 'param3']]
- command: COMMAND
- params: ['param1', 'param2', 'param3']
This is something of a problem, since we have lost the distinction between
which part of the params is the middle and which is the trailing. The issue
is the recursive definition of params which, as you pointed out, makes the
results awkward to work with. The best I could do here was to define params
using:
params <<= (COLON + trailing("trailing") |
middle("middle*") + params)
(I'm using the abbreviated version of `setResultsName`, using the
expressions as callables - the trailing '*' in "middle*" is equivalent to
`middle.setResultsName("middle", listAllMatches=True)`. And as you probably
already discovered, if `listAllMatches` is left out, then you will only get
the last element of `middle`.)
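A minimal illustration of the difference (my own toy example, not from the message):

```python
from pyparsing import Word, alphas

word = Word(alphas)
last_only = word("w") + word("w")      # plain name: the later match overwrites the earlier one
all_matches = word("w*") + word("w*")  # trailing '*': listAllMatches=True, matches accumulate

print(last_only.parseString("foo bar").w)          # -> bar
print(list(all_matches.parseString("foo bar").w))  # -> ['foo', 'bar']
```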
With this change, I get:
COMMAND param1 param2 : param3
['COMMAND', ['param1', 'param2', 'param3']]
- command: COMMAND
- params: ['param1', 'param2', 'param3']
- middle: [['param1'], ['param2']]
[0]:
['param1']
[1]:
['param2']
- trailing: param3
Which is *okay*, but that middle bit is not really pleasant to deal with.
But I'd like to look at this recursive construct in the original BNF:
<params> ::= <SPACE> [ ':' <trailing> | <middle> <params> ]
This is very typical in many BNFs, which will define a repetition of one or
more items as:
<list_of_items> ::= <item> [ <list_of_items> ]
This *can* be implemented in pyparsing as:
list_of_items = Forward()
list_of_items <<= item + Optional(list_of_items)
But you'll find in pyparsing that things are usually clearer (and faster)
when you define repetition using the OneOrMore or ZeroOrMore classes:
list_of_items = OneOrMore(item)
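A quick check (my own example) that the two formulations accept the same input; note that the recursive version needs the bracketed optional from the BNF, or it can never terminate on a finite list:

```python
from pyparsing import Forward, OneOrMore, Optional, Word, nums

item = Word(nums)

# Recursive formulation of <list_of_items> ::= <item> [ <list_of_items> ]
recursive_list = Forward()
recursive_list <<= item + Optional(recursive_list)

# Repetition formulation
repeated_list = OneOrMore(item)

print(recursive_list.parseString("1 2 3").asList())  # -> ['1', '2', '3']
print(repeated_list.parseString("1 2 3").asList())   # -> ['1', '2', '3']
```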
If we use a repetition expression instead of a recursive expression for
params, it looks like this:
params = (OneOrMore(middle)("middle") +
COLON +
ZeroOrMore(trailing)("trailing"))
And the parsed test string gives:
COMMAND param1 param2 : param3
['COMMAND', ['param1', 'param2', 'param3']]
- command: COMMAND
- params: ['param1', 'param2', 'param3']
- middle: ['param1', 'param2']
- trailing: ['param3']
Here is the whole parser in one copy/pasteable chunk:
command = Word(alphas) | Word(nums)
COLON = Suppress(':')
middle = ~COLON + Word(printables)
trailing = Word(printables)
params = (OneOrMore(middle)("middle") +
COLON +
ZeroOrMore(trailing)("trailing"))
line = (command("command") + Group(params)("params"))
tests = """\
COMMAND param1 param2 : param3"""
line.runTests(tests)
And no need to kludge in any `listAllMatches` behavior either.
-- Paul