Re: [Pyparsing] Remove dependency on xml.sax.saxutils?

Looks fine to me.  Certainly addresses my original issue, and then some.

Cheers,
Mike

Paul McGuire wrote:
> Mike -
>
> Thanks for this submission, I see no reason why I wouldn't just drop this
> into the main pyparsing code - it seems to conditionalize around the
> presence/absence of xml.sax.saxutils very nicely.
>
> My question is more about just how minimal/lame xml.sax.saxutils.escape
> actually seems to be.  In the list of common HTML entities defined later in
> pyparsing.py, I also include a mapping for '"' to "&quot;", but xml...escape
> does not handle that case.  There is also handling of an optional dict,
> which if provided calls __dict_replace, which is not implemented.  I think I
> am less interested in a verbatim copy of xml...escape than I am in having
> one that does a decent job of escaping - I think maybe I am more picky about
> this code since it would actually become part of the pyparsing source.  
>
> So I think I will just discard importing and using xml.sax.saxutils.escape
> altogether, and replace it with xml_escape, which will be implemented as:
>
>     def xml_escape(data):
>         """Escape &, <, >, ", etc. in a string of data."""
>
>         # ampersand must be replaced first, so that the '&' introduced by
>         # the other entities is not escaped a second time
>         from_symbols = '&><"'
>         to_symbols = ['&'+s+';' for s in "amp gt lt quot".split()]
>         for from_,to_ in zip(from_symbols, to_symbols):
>             data = data.replace(from_, to_)
>         return data
>
> This handles the 4 special entities defined in HTML 2.0
> (http://www.w3.org/MarkUp/html-spec/html-spec_9.html#SEC9.7).
>
> -- Paul
>
> (On further review, I see that I was erroneously mapping ' to &quot; instead
> of " - I'll have that fix along with xml_escape posted to SVN shortly.)
>
>   
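
To illustrate the ordering comment in the quoted function, here is a minimal sketch. It copies xml_escape as quoted above. xml_escape_wrong_order is a hypothetical counter-example, not part of pyparsing, that replaces the ampersand last. The sample string is made up for the demonstration.

```python
def xml_escape(data):
    """Escape &, >, < and " in a string of data (as quoted above)."""
    # ampersand must be replaced first
    from_symbols = '&><"'
    to_symbols = ['&' + s + ';' for s in "amp gt lt quot".split()]
    for from_, to_ in zip(from_symbols, to_symbols):
        data = data.replace(from_, to_)
    return data


def xml_escape_wrong_order(data):
    """Hypothetical counter-example: same mapping, but '&' is handled last."""
    pairs = [('>', '&gt;'), ('<', '&lt;'), ('"', '&quot;'), ('&', '&amp;')]
    for from_, to_ in pairs:
        # by the time '&' is replaced, the earlier entities already contain one
        data = data.replace(from_, to_)
    return data


sample = 'a < b && "c" > d'
print(xml_escape(sample))              # a &lt; b &amp;&amp; &quot;c&quot; &gt; d
print(xml_escape_wrong_order(sample))  # entities come out double-escaped
```

With the ampersand handled last, a lone '<' comes out as '&amp;lt;' rather than '&lt;', which is why the order of from_symbols matters.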

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA