Re: [Pyparsing] parsing SVG styles
From: Paul M. <pt...@au...> - 2007-12-09 17:54:45
Donn -

This really is pretty well-suited to pyparsing, but you still have some basics to learn.

- Why did you wrap "fill:#" in an Optional? If anything, it is a Literal, but in your grammar, you can just use the string itself.

- parseString is suitable ONLY if you fully specify the grammar for the input string. Since you are trying to pick matches out from amongst other noise, searchString or scanString are better choices. scanString returns a generator, which means you have to iterate over it with a for loop, or use something like the list constructor to convert it to a list. scanString also returns the start and end locations for each match. In your case, you don't need this extra info, so just use the simpler searchString (searchString is just a wrapper around scanString - it discards the extra data, and just returns a list of the matches).

- Your grammar was wrong in a few places. The # sign is a marker for the hex values in fill and stroke only, and is not used in the fill-opacity or stroke-width commands. Since the # sign goes with the hex values, I included it as a suppressed prefix on hexNums, and removed it from the various command definitions.

- Likewise, I defined COLON as Suppress(":"), so that the returned values have just the interesting names, with no trailing colons.

- With these changes, searchString will return a list of key-value pairs. Note the easy way to change this to a dict, given at the end of the example.

In short:
- use the correct method for parsing or sifting or transforming data
- look more closely at your input string to define your expressions properly

Don't give up, this parsing stuff takes some getting used to, AND practice!
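To make the searchString/scanString difference concrete, here is a condensed sketch of the points above. It assumes Python 3 and a pyparsing release that still accepts the camelCase method names; the shortened `style` string and the snake_case variable names are made up for illustration and are not from the original post.

```python
from pyparsing import Combine, Literal, Optional, Suppress, Word, hexnums, nums

# "6" or "0.4611111", returned as a single string
float_or_int = Combine(Word(nums) + Optional(Literal(".") + Word(nums)))
# "#6bdc23": the "#" must be present but is dropped from the results
hex_color = Suppress("#") + Word(hexnums)
COLON = Suppress(":")
SEMI = Suppress(";")

style_command = (
    "fill" + COLON + hex_color + SEMI
    | "fill-opacity" + COLON + float_or_int + SEMI
    | "stroke" + COLON + hex_color + SEMI
    | "stroke-width" + COLON + float_or_int + SEMI
)

# Hypothetical, shortened style attribute (illustration only)
style = ("opacity:1;color:#000000;fill:#6bdc23;fill-opacity:0.4611111;"
         "fill-rule:nonzero;stroke:#ff0000;stroke-width:6;stroke-linecap:butt")

# searchString: just the matched tokens, here one [name, value] pair per match
matches = style_command.searchString(style)
styles = dict(matches.asList())
print(styles)

# scanString: a generator of (tokens, start, end); iterate it to get values
spans = []
for tokens, start, end in style_command.scanString(style):
    spans.append(style[start:end])
print(spans)
```

Because every match reduces to exactly two tokens (the colon, semicolon and # are all suppressed), the list of matches converts directly to a dict.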
-- Paul

(my modified version, with comments)

from pyparsing import *

# Cover [1] or [0.587]
floatOrInt = Combine(Word(nums) + Optional(Literal(".") + Word(nums)))

# Cover any amount of hex, [ab][abcf]
hexNums = Word(hexnums)
# the #-sign is a prefix for hex numbers in SVG, so make it part of hexNums
# instead of repeating it in each label that takes a hex value
hexNums = Suppress("#") + Word(hexnums)

# A semi-colon seps commands
semi = Literal(";").suppress()
COLON = Literal(":").suppress()

# hacking the commands
FILL_command = Optional("fill:#")# + Group(hexNums + semi)  # why Optional here?
FILL_command = "fill" + COLON + hexNums + semi
FILLOPACITY_command = "fill-opacity:#" + floatOrInt + semi
FILLOPACITY_command = "fill-opacity" + COLON + floatOrInt + semi
STROKECOLOR_command = "stroke:#" + hexNums + semi
STROKECOLOR_command = "stroke" + COLON + hexNums + semi
STROKEWIDTH_command = "stroke-width:#" + floatOrInt + semi
STROKEWIDTH_command = "stroke-width" + COLON + floatOrInt + semi

# Trying to sum them up. Remarked down to one for testing
stylecommand = FILL_command | FILLOPACITY_command | STROKECOLOR_command | STROKEWIDTH_command

# Hacked during tests, tried to simplify.
phrase2 = stylecommand #OneOrMore(Group(stylecommand))

#The test string
style="opacity:1;color:#000000;fill:#6bdc23;fill-opacity:0.4611111;fill-rule:nonzero;stroke:#ff0000;stroke-width:6;stroke-linecap:butt;stroke-linejoin:miter;marker:none;marker-start:none;marker-mid:none;marker-end:none;stroke-miterlimit:2;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1;visibility:visible;display:inline;overflow:visible;enable-background:accumulate"

#~ tokensStyle = phrase2.parseString(style) # I also tried scanString()

# scanString returns a generator, do you know how to extract values
# from a generator? This is a Python thing, not a pyparsing thing.
# If you don't need extra info about each match (like the start and end
# locations), just use searchString

# parseString is clearly the wrong choice here, since you are picking
# out selected matches from among other junk; searchString is
# the simplest
tokensStyle = phrase2.searchString(style)

print tokensStyle # Trying to get a result.

for a in tokensStyle:
    print a
##    for command in a:
##        print ":",command

# An easy way to convert searchString results to a dict, for this
# example (since grammar returns each element as a key-value pair)
print dict(tokensStyle.asList())

-----Original Message-----
From: pyp...@li... [mailto:pyp...@li...] On Behalf Of Donn Ingle
Sent: Sunday, December 09, 2007 7:03 AM
To: pyp...@li...
Subject: [Pyparsing] parsing SVG styles

Hello again,

I have now spent almost 3 hours on this and I've also looked at the examples and read the pdfs, but I just can't get this going. I actually find the docs and the adventure example too complicated -- I'm a simple sort. I'll post my code and hope for mercy -- as I've been told to rtfm before :)

I'm trying to "pick-out" certain keywords (and args) from a string (style node in an SVG file) from amidst a babble of noise and just record those for later use.

fill:#[6 hex nums];
fill-opacity:#[float or int];
stroke:#[6 hex nums];
stroke-width:#[float or int];

This is my latest test:

# Cover [1] or [0.587]
floatOrInt = Combine(Word(nums) + Optional(Literal(".") + Word(nums)))

# Cover any amount of hex, [ab][abcf]
hexNums = Word(hexnums)

# A semi-colon seps commands
semi = Literal(";").suppress()

# hacking the commands
FILL_command = Optional("fill:#")# + Group(hexNums + semi)
FILLOPACITY_command = "fill-opacity:#" + floatOrInt + semi
STROKECOLOR_command = "stroke:#" + hexNums + semi
STROKEWIDTH_command = "stroke-width:#" + floatOrInt + semi

# Trying to sum them up. Remarked down to one for testing
stylecommand = FILL_command# | FILLOPACITY_command | STROKECOLOR_command | STROKEWIDTH_command

# Hacked during tests, tried to simplify.
phrase2 = stylecommand#OneOrMore(Group(stylecommand))

#The test string
style="opacity:1;color:#000000;fill:#6bdc23;fill-opacity:0.4611111;fill-rule:nonzero;stroke:#ff0000;stroke-width:6;stroke-linecap:butt;stroke-linejoin:miter;marker:none;marker-start:none;marker-mid:none;marker-end:none;stroke-miterlimit:2;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:1;visibility:visible;display:inline;overflow:visible;enable-background:accumulate"

tokensStyle = phrase2.parseString(style) # I also tried scanString()

print tokensStyle # Trying to get a result.

for a in tokensStyle:
    print a
##    for command in a:
##        print ":",command

\d