From: Waylan L. <wa...@gm...> - 2007-11-05 05:23:55
I've just committed a patch to svn (r53) that provides a nice middle
ground on the escaping vs. removing HTML issue. The old behavior is
still the default, but escaping is provided as an option. Currently,
the global variable `HTML_REMOVED_TEXT` holds the text that is used
for replacement. I set it up so that if that string is empty (or
otherwise evaluates to `False` in Python) then the HTML is escaped
instead. In other words, you turn escaping on in the same way that you
change the replacement text. Here's an example:

>>> import markdown
>>> markdown.HTML_REMOVED_TEXT = ''
>>> md = markdown.Markdown(safe_mode=True)
>>> md.convert('<a href="foo">foo</a> bar.')
'<p>&lt;a href="foo"&gt;foo&lt;/a&gt; bar.\n</p>'

I left the default as the old behavior, but that could easily be
switched. I also considered adding a new global (perhaps
`ESCAPE_HTML`) which would simply hold a True/False value, but
couldn't justify adding another variable. If anyone feels otherwise,
let me know.
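
For readers following along, here is a minimal sketch of the toggle described above. It is not the committed code: `sanitize_raw_html` is a hypothetical helper, the default text is a placeholder, and `html.escape` from current Python 3 stands in for the patch's own basic escaping.

```python
import html

# Stand-in for the module-level global described above. The value is a
# placeholder for illustration, not necessarily the library's default.
HTML_REMOVED_TEXT = "[HTML_REMOVED]"


def sanitize_raw_html(raw, removed_text):
    """Hypothetical helper: replace raw HTML, or escape it if the
    replacement text is empty (or otherwise falsy)."""
    if removed_text:
        # Old behavior: the raw HTML is swapped for the replacement text.
        return removed_text
    # New option: a falsy replacement text turns escaping on instead.
    return html.escape(raw, quote=False)


replaced = sanitize_raw_html("<b>hi</b>", HTML_REMOVED_TEXT)
escaped = sanitize_raw_html("<b>hi</b>", "")
```

Note that with this design an empty replacement string can no longer mean "strip the HTML and leave nothing behind".
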

I see one potential problem with my solution which I hadn't considered
until just now (after committing my patch). One could already have
code that sets `HTML_REMOVED_TEXT` to an empty string so that all HTML
is stripped and replaced with nothing. Some may prefer such a
behavior. This makes that impossible to do. Is anyone doing this?
Adding `ESCAPE_HTML` would address this issue, if it is one.

Another solution would be to change the expected values of the
`safe_mode` parameter for Markdown() to one of 'strip', 'escape', or
None rather than True/False. But that could get complicated/confusing.
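
A rough sketch of how that three-valued `safe_mode` might behave, purely as an illustration: `apply_safe_mode` and its `removed_text` parameter are hypothetical names, not part of the patch, and `html.escape` is used as a stand-in for the escaping step.

```python
import html


def apply_safe_mode(raw, safe_mode=None, removed_text="[HTML_REMOVED]"):
    """Hypothetical dispatcher for safe_mode values 'strip', 'escape', None."""
    if safe_mode is None:
        # Safe mode off: raw HTML passes through untouched.
        return raw
    if safe_mode == "strip":
        # The replacement text may be empty, so HTML can vanish entirely.
        return removed_text
    if safe_mode == "escape":
        return html.escape(raw, quote=False)
    raise ValueError("safe_mode must be 'strip', 'escape', or None")


stripped = apply_safe_mode("<b>hi</b>", "strip", removed_text="")
escaped = apply_safe_mode("<b>hi</b>", "escape")
```

Because the mode is chosen explicitly, an empty replacement string keeps its old meaning here and no longer doubles as the escaping switch.
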

Oh, and obviously, the value of `HTML_REMOVED_TEXT` can be changed in
the source file if one will always want that behavior. That can become
a headache on upgrading to a new version though. It's usually better to
future-proof your code IMO.

I should also mention that I moved the code that does the
escaping/removing from the convert method to a text-post-processor. It
makes more sense there regardless of this change IMO and simplifies
the process of making your own extension to change the behavior.
Extensions would be another way to address the issues I mention above.
Perhaps we could just leave it at that.
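
To make the extension idea concrete, here is a loose sketch of what a custom text post-processor could look like. Everything in it is an assumption for illustration: the class name, the `run(text)` method, the placeholder format, and the idea that raw HTML is stashed under placeholders are not taken from the patch.

```python
import html


class EscapeRawHtmlPostprocessor:
    """Hypothetical text post-processor that restores stashed raw HTML
    in escaped form after the rest of the document has been converted."""

    def __init__(self, stash):
        # Assumed structure: placeholder string -> raw HTML it stands for.
        self.stash = stash

    def run(self, text):
        # Swap each placeholder in the converted text for escaped HTML.
        for placeholder, raw in self.stash.items():
            text = text.replace(placeholder, html.escape(raw, quote=False))
        return text


stash = {"@@RAW0@@": '<a href="foo">foo</a>'}
result = EscapeRawHtmlPostprocessor(stash).run("<p>@@RAW0@@ bar.</p>")
```

An extension wanting different behavior (say, stripping to nothing) would only need to swap in a post-processor with a different `run`.
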

The escaping is very basic. Any improvements are welcome. Anyone know
of a method already available in the Python standard lib?
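
For anyone reading this on a current Python 3, two standard-library candidates are `html.escape` and `xml.sax.saxutils.escape`. The snippet below only illustrates those two calls; it says nothing about what the patch itself uses.

```python
import html
from xml.sax.saxutils import escape as sax_escape

raw = '<a href="foo">foo</a> & bar'

# html.escape handles &, < and >, plus quote characters by default.
escaped_all = html.escape(raw)

# With quote=False only &, < and > are escaped.
escaped_basic = html.escape(raw, quote=False)

# xml.sax.saxutils.escape also covers just &, < and > by default.
escaped_sax = sax_escape(raw)
```

Whether quotes need escaping depends on where the text ends up; inside element content the three-character form is generally enough.
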

Any objections, comments, or suggestions are welcome.

On 6/12/07, Yuri Takhteyev <qar...@gm...> wrote:
> You should be able to do this with a preprocessor by simply
> pre-escaping all HTML, no? Alternatively, if you want a quick and
> dirty hack, look for the line that says:
>
> if self.safeMode and html != "<hr />" and html != "<br />":
>     html = HTML_REMOVED_TEXT
>
> I do agree though that perhaps escaping html would be a better
> default. (Please do file a bug on sourceforge so that I don't forget
> to make this change later.) In the long term, perhaps, the new and
> more flexible way of managing pre-post-etc-processors would solve this
> problem as well.
>
> > implementation removes HTML . (like why HTML is escaped in code blocks
> > and not fully removed) ..
>
> An oversight on my part...
>
> > P.S.: @Yuri Takhteyev: i guess you don't really care any more since
> > you've already put up a wiki .. but anyway .. http://sct.sphene.net/
> > is my wiki based on python-markdown (and django)
>
> I will stick with what I installed, but I do _care_ - it's good to
> have a Wiki based on this module. Please add your project to the wiki
> under "Related Projects".
>
> - yuri
>
> --
> http://www.freewisdom.org/
>
> _______________________________________________
> Python-markdown-discuss mailing list
> Pyt...@li...
> https://lists.sourceforge.net/lists/listinfo/python-markdown-discuss
>
--
----
Waylan Limberg
wa...@gm...