From: Waylan L. <wa...@gm...> - 2007-11-05 05:23:55
I've just committed a patch to svn (r53) that provides a nice middle ground on the escaping vs. removing HTML issue. The old behavior is still the default, but escaping is provided as an option.

Currently, the global variable `HTML_REMOVED_TEXT` holds the text that is used for replacement. I set it up so that if that string is empty (or otherwise evaluates to `False` in Python) then the HTML is escaped instead. In other words, you turn escaping on in the same way that you change the replacement text. Here's an example:

    >>> import markdown
    >>> markdown.HTML_REMOVED_TEXT = ''
    >>> md = markdown.Markdown(safe_mode=True)
    >>> md.convert('<a href="foo">foo</a> bar.')
    '<p>&lt;a href="foo"&gt;foo&lt;/a&gt; bar.\n</p>'

I left the default as the old behavior, but that could easily be switched. I also considered adding a new global (perhaps `ESCAPE_HTML`) which would simply hold a True/False value, but couldn't see adding an additional variable. If anyone feels otherwise, let me know.

I see one potential problem with my solution which I hadn't considered until just now (after committing my patch). One could already have code that sets `HTML_REMOVED_TEXT` to an empty string so that all HTML is stripped and replaced with nothing. Some may prefer such a behavior. This change makes that impossible to do. Is anyone doing this? Adding `ESCAPE_HTML` would address this issue, if it is one. Another solution would be to change the expected values of the `safe_mode` parameter for Markdown() to one of 'strip', 'escape', or None rather than True/False. But that could get complicated/confusing.

Oh, and obviously, the value of `HTML_REMOVED_TEXT` can be changed in the source file if one will always want that behavior. That can become a headache when upgrading to a new version though. It's usually better to future-proof your code IMO.

I should also mention that I moved the code that does the escaping/removing from the convert method to a text-post-processor.
It makes more sense there regardless of this change IMO, and it simplifies the process of making your own extension to change the behavior. Extensions would be another way to address the issues I mention above. Perhaps we could just leave it at that.

The escaping is very basic. Any improvements are welcome. Anyone know of a method already available in the Python standard lib?

Any objections, comments, or suggestions are welcome.

On 6/12/07, Yuri Takhteyev <qar...@gm...> wrote:
> You should be able to do this with a preprocessor by simply
> pre-escaping all HTML, no? Alternatively, if you want a quick and
> dirty hack, look for the line that says:
>
>     if self.safeMode and html != "<hr />" and html != "<br />":
>         html = HTML_REMOVED_TEXT
>
> I do agree though that perhaps escaping html would be a better
> default. (Please do file a bug on sourceforge so that I don't forget
> to make this change later.) In the long term, perhaps, the new and
> more flexible way of managing pre-post-etc-processors would solve this
> problem as well.
>
> > implementation removes HTML . (like why HTML is escaped in code blocks
> > and not fully removed) ..
>
> An oversight on my part...
>
> > P.S.: @Yuri Takhteyev: i guess you don't really care any more since
> > you've already put up a wiki .. but anyway .. http://sct.sphene.net/
> > is my wiki based on python-markdown (and django)
>
> I will stick with what I installed, but I do _care_ - it's good to
> have a Wiki based on this module. Please add your project to the wiki
> under "Related Projects".
>
> - yuri
>
> --
> http://www.freewisdom.org/
>
> _______________________________________________
> Python-markdown-discuss mailing list
> Pyt...@li...
> https://lists.sourceforge.net/lists/listinfo/python-markdown-discuss
>

--
----
Waylan Limberg
wa...@gm...
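
For illustration only, here is a minimal sketch of the 'strip' / 'escape' / None idea floated above, using a standard-library escaping function. It is not the code committed in r53. The function name `sanitize_raw_html`, its `mode` and `removed_text` parameters, and the placeholder value given to `HTML_REMOVED_TEXT` are all assumptions made for this example. It relies on `html.escape`, which is available in Python 3.2 and later.

```python
import html

# Assumed placeholder value, used here only for illustration.
HTML_REMOVED_TEXT = "[HTML_REMOVED]"


def sanitize_raw_html(raw, mode="strip", removed_text=HTML_REMOVED_TEXT):
    """Handle one raw HTML fragment according to a hypothetical mode.

    mode="strip"  -> replace the fragment with removed_text (may be '')
    mode="escape" -> return the fragment with &, <, > and quotes escaped
    mode=None     -> return the fragment unchanged (safe mode off)
    """
    if mode is None:
        return raw
    if mode == "strip":
        return removed_text
    if mode == "escape":
        # html.escape replaces &, <, > and, by default, both quote characters.
        return html.escape(raw)
    raise ValueError("mode must be 'strip', 'escape', or None")


if __name__ == "__main__":
    fragment = '<a href="foo">foo</a>'
    print(sanitize_raw_html(fragment, mode="escape"))
    print(sanitize_raw_html(fragment, mode="strip", removed_text=""))
```

Keeping the mode separate from the replacement text means an empty `removed_text` can still mean "strip and replace with nothing", which is the case the empty-string switch would otherwise rule out.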