Re: [Pyparsing] use of Dict
From: Paul M. <pt...@au...> - 2008-02-16 17:24:52
Dict is not meant as "here is a dict entry with this particular keyword and this value." It is meant more as "here is a list of grouped entries and values, to be returned as a dict; take the first item of each group as the key, and the remaining items in each group as that key's value." In your case, a more likely definition would be:

    from pyparsing import Dict, Group, OneOrMore, Word, oneOf, alphas, alphanums, nums

    keylabel = oneOf("hello world")
    # each Group is one dict entry: a key followed by its value
    p = Dict(OneOrMore(Group(keylabel + (Word(nums) | Word(alphas, alphanums)))))
    results = p.parseString("hello abc world 2134")
    print(results.keys())
    print(results.dump())
    print(results.hello)

The entries *must* be explicitly grouped, else the tokens will just run together and Dict won't know where values stop and the next key starts.

In a larger grammar, the Dict expression is usually given a results name (say "dictVals") and then the entries in the dict can be referenced as "dictVals.hello" or "dictVals['world']" (using the keys from your example).

I tried to simplify the use of Dict by providing the dictOf helper method. It would change the above to:

    from pyparsing import dictOf

    keylabel = oneOf("hello world")
    p = dictOf(keylabel, (Word(nums) | Word(alphas, alphanums)))

Here dictOf gets called with two expressions - the first is the expression for matching keys in the dict, and the second is the expression for matching the values.

It is atypical (but not impossible) to have a list of known keywords that would be keys. In the dictExample.py script, which ships in the pyparsing examples directory, the keys are labels in a table of data statistics: min, max, etc. These could have been hardcoded as oneOf("min max ave sdev"), but I could just reference them as Word(alphas), since their placement in the table was unambiguous.
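As a rough illustration of the grouping rule described above (first item of each group is the key, the remaining items are the value), here is a plain-Python sketch. It is not pyparsing's implementation: groups_to_dict is a hypothetical helper, and storing a single remaining item bare rather than in a list is an assumption made for this sketch.

```python
# Plain-Python sketch of the grouping rule; this is NOT pyparsing's
# implementation, and groups_to_dict is a hypothetical helper name.

def groups_to_dict(groups):
    """Build a dict from groups: first item is the key, the rest is the value."""
    result = {}
    for group in groups:
        key, rest = group[0], group[1:]
        # Assumption for this sketch: a single remaining item is stored
        # bare, several remaining items are stored as a list.
        result[key] = rest[0] if len(rest) == 1 else list(rest)
    return result


# Grouped tokens: each inner list is one key with its value.
grouped = [["hello", "abc"], ["world", "2134"]]
print(groups_to_dict(grouped))

# Ungrouped tokens: nothing marks where one entry ends and the next
# begins, so the whole run is treated as a single entry.
flat = [["hello", "abc", "world", "2134"]]
print(groups_to_dict(flat))
```

The second call shows why the explicit grouping matters: without it, "world" can no longer be told apart from a value belonging to "hello".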
The configParse.py example uses nested Dicts to permit the values in an INI file to be referenced as "config.section.subsection.subsubsection.etc"

-- Paul

Here is the text of dictExample.py - please download either the source or docs distributions from SourceForge to get the complete documentation and examples directories (not included when using easy_install or the Windows installer):

    #
    # dictExample.py
    #
    # Illustration of using pyparsing's Dict class to process tabular data
    #
    # Copyright (c) 2003, Paul McGuire
    #
    from pyparsing import Literal, Word, Group, Dict, ZeroOrMore, alphas, nums, delimitedList
    import pprint

    testData = """
    +-------+------+------+------+------+------+------+------+------+
    |       |  A1  |  B1  |  C1  |  D1  |  A2  |  B2  |  C2  |  D2  |
    +=======+======+======+======+======+======+======+======+======+
    | min   |   7  |  43  |   7  |  15  |  82  |  98  |   1  |  37  |
    | max   |  11  |  52  |  10  |  17  |  85  | 112  |   4  |  39  |
    | ave   |   9  |  47  |   8  |  16  |  84  | 106  |   3  |  38  |
    | sdev  |   1  |   3  |   1  |   1  |   1  |   3  |   1  |   1  |
    +-------+------+------+------+------+------+------+------+------+
    """

    # define grammar for datatable
    heading = (Literal(
    "+-------+------+------+------+------+------+------+------+------+") +
    "|       |  A1  |  B1  |  C1  |  D1  |  A2  |  B2  |  C2  |  D2  |" +
    "+=======+======+======+======+======+======+======+======+======+").suppress()
    vert = Literal("|").suppress()
    number = Word(nums)
    rowData = Group( vert + Word(alphas) + vert + delimitedList(number,"|") + vert )
    trailing = Literal(
    "+-------+------+------+------+------+------+------+------+------+").suppress()

    datatable = heading + Dict( ZeroOrMore(rowData) ) + trailing

    # now parse data and print results
    data = datatable.parseString(testData)
    print(data)
    pprint.pprint(data.asList())
    print("data keys=", data.keys())
    print("data['min']=", data['min'])
    print("data.max", data.max)
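For readers who want to see the shape of the keyed result without installing pyparsing, the row handling in dictExample.py can be approximated with plain string operations. This is only a sketch under stated assumptions: table_to_dict is a hypothetical helper, it relies on the fixed "|"-delimited layout of this particular table, and it yields ordinary lists of strings rather than pyparsing's results objects.

```python
# Plain-Python approximation of the row handling in dictExample.py.
# table_to_dict is a hypothetical helper, not part of pyparsing.

TABLE = """
+-------+------+------+------+------+------+------+------+------+
|       |  A1  |  B1  |  C1  |  D1  |  A2  |  B2  |  C2  |  D2  |
+=======+======+======+======+======+======+======+======+======+
| min   |   7  |  43  |   7  |  15  |  82  |  98  |   1  |  37  |
| max   |  11  |  52  |  10  |  17  |  85  | 112  |   4  |  39  |
| ave   |   9  |  47  |   8  |  16  |  84  | 106  |   3  |  38  |
| sdev  |   1  |   3  |   1  |   1  |   1  |   3  |   1  |   1  |
+-------+------+------+------+------+------+------+------+------+
"""


def table_to_dict(text):
    """Map each row label (first cell) to the list of remaining cells."""
    rows = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +---+ and +===+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        label, values = cells[0], cells[1:]
        if not label.isalpha():
            continue  # skip the column-heading row, whose first cell is empty
        rows[label] = values
    return rows


data = table_to_dict(TABLE)
print(sorted(data))
print(data["min"])
```

As in the pyparsing version, the row label is recognised purely by its position as the first cell, so the labels do not need to be listed in advance.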