Re: [Pyparsing] Remove dependency on xml.sax.saxutils?
From: Michael D. <md...@st...> - 2008-06-04 18:04:14
Looks fine to me. Certainly addresses my original issue, and then some.

Cheers,
Mike

Paul McGuire wrote:
> Mike -
>
> Thanks for this submission, I see no reason why I wouldn't just drop this
> into the main pyparsing code - it seems to conditionalize around the
> presence/absence of xml.sax.saxutils very nicely.
>
> My question is more about just how minimal/lame xml.sax.saxutils.escape
> actually seems to be. In the list of common HTML entities defined later in
> pyparsing.py, I also include a mapping for '"' to "&quot;", but xml...escape
> does not handle that case. There is also handling of an optional dict,
> which if provided calls __dict_replace, which is not implemented. I think I
> am less interested in a verbatim copy of xml...escape than I am in having
> one that does a decent job of escaping - I think maybe I am more picky about
> this code since it would actually become part of the pyparsing source.
>
> So I think I will just discard importing and using xml.sax.saxutils.escape
> altogether, and replace it with xml_escape, which will be implemented as:
>
> def xml_escape(data):
>     """Escape &, <, >, ", etc. in a string of data."""
>
>     # ampersand must be replaced first
>     from_symbols = '&><"'
>     to_symbols = ['&'+s+';' for s in "amp gt lt quot".split()]
>     for from_,to_ in zip(from_symbols, to_symbols):
>         data = data.replace(from_, to_)
>     return data
>
> This handles the 4 special entities defined in HTML 2.0
> (http://www.w3.org/MarkUp/html-spec/html-spec_9.html#SEC9.7).
>
> -- Paul
>
> (On further review, I see that I was erroneously mapping ' to &quot; instead
> of " - I'll have that fix along with xml_escape posted to SVN shortly.)
>

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA
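
For anyone reading this thread later, here is a small self-contained sketch of the xml_escape function quoted above, run next to the standard library's xml.sax.saxutils.escape for comparison. The function body follows Paul's message; the sample string and the variable names (sample, escaped, stdlib_escaped, stdlib_with_quot) are made up for illustration and are not part of pyparsing.

```python
from xml.sax.saxutils import escape as sax_escape


def xml_escape(data):
    """Escape &, <, >, and " in a string of data."""
    # The ampersand must be replaced first; otherwise the '&' that the
    # later replacements introduce would itself be escaped a second time.
    from_symbols = '&><"'
    to_symbols = ['&' + s + ';' for s in "amp gt lt quot".split()]
    for from_, to_ in zip(from_symbols, to_symbols):
        data = data.replace(from_, to_)
    return data


# Hypothetical sample input, chosen to exercise all four characters.
sample = 'a < b & "c" > d'

escaped = xml_escape(sample)

# The stdlib function escapes only &, < and > unless told otherwise,
# so the double quotes pass through unchanged here.
stdlib_escaped = sax_escape(sample)

# Passing an entities dict makes the stdlib function cover '"' as well.
stdlib_with_quot = sax_escape(sample, {'"': '&quot;'})

print(escaped)
print(stdlib_escaped)
print(stdlib_with_quot)
```

The replacement order is what matters: if '&' were handled after '<', the '&' in the freshly inserted '&lt;' would be escaped again and come out as '&amp;lt;'.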