Re: [Pyparsing] Remove dependency on xml.sax.saxutils?
From: Paul M. <pt...@au...> - 2008-06-04 17:41:59
Mike -

Thanks for this submission, I see no reason why I wouldn't just drop this into the main pyparsing code - it seems to conditionalize around the presence/absence of xml.sax.saxutils very nicely.

My question is more about just how minimal/lame xml.sax.saxutils.escape actually seems to be. In the list of common HTML entities defined later in pyparsing.py, I also include a mapping for '"' to "&quot;", but xml...escape does not handle that case. There is also handling of an optional dict, which if provided calls __dict_replace, which is not implemented.

I think I am less interested in a verbatim copy of xml...escape than I am in having one that does a decent job of escaping - I think maybe I am more picky about this code since it would actually become part of the pyparsing source.

So I think I will just discard importing and using xml.sax.saxutils.escape altogether, and replace it with xml_escape, which will be implemented as:

    def xml_escape(data):
        """Escape &, <, >, ", etc. in a string of data."""

        # ampersand must be replaced first
        from_symbols = '&><"'
        to_symbols = ['&'+s+';' for s in "amp gt lt quot".split()]
        for from_,to_ in zip(from_symbols, to_symbols):
            data = data.replace(from_, to_)
        return data

This handles the 4 special entities defined in HTML 2.0 (http://www.w3.org/MarkUp/html-spec/html-spec_9.html#SEC9.7).

-- Paul

(On further review, I see that I was erroneously mapping ' to &quot; instead of " - I'll have that fix along with xml_escape posted to SVN shortly.)
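For anyone following along, here is a small sketch of why the ampersand has to be replaced first, and of how the proposed function differs from the stdlib one on double quotes. The xml_escape body is copied from the message above; escape_wrong_order is a hypothetical helper made up purely for contrast and is not part of pyparsing.

```python
from xml.sax.saxutils import escape as sax_escape


def xml_escape(data):
    """Escape &, <, >, and " in a string of data (as proposed in this thread)."""
    # ampersand must be replaced first
    from_symbols = '&><"'
    to_symbols = ['&' + s + ';' for s in "amp gt lt quot".split()]
    for from_, to_ in zip(from_symbols, to_symbols):
        data = data.replace(from_, to_)
    return data


def escape_wrong_order(data):
    """Hypothetical counter-example: replaces '<' before '&'."""
    # The '&' introduced by '&lt;' is escaped a second time below.
    data = data.replace('<', '&lt;')
    data = data.replace('&', '&amp;')
    return data


sample = 'say "hi" & <go>'

# Proposed function: all four characters are escaped exactly once.
print(xml_escape(sample))

# Stdlib escape with no extra entities leaves the double quotes alone.
print(sax_escape(sample))

# Stdlib escape matches xml_escape only when the quote mapping is passed in.
print(sax_escape(sample, {'"': '&quot;'}))

# Wrong order double-escapes: '<' ends up as '&amp;lt;'.
print(escape_wrong_order('<'))
```

The stdlib escape only covers the double quote if the caller supplies the extra mapping, which is the gap described above.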