
#1 make parser discard STYLE tag contents?

open
nobody
None
5
2012-09-14
2012-07-24
No

Hi, I just came across this project, and it looks like a fabulous fit for converting email to plaintext for an app I'm working on. One thing I'd like to do, however, is make it discard the contents of embedded style tags (e.g., in emails from Pinterest) rather than displaying them as text nodes. Is this something it already supports? If not, would it be difficult to add? Any pointers on where in the code to look to add this functionality?
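To make the first request concrete, here is a minimal sketch of the behavior in Python, using only the standard library's `html.parser`. It is not this project's code or API: the class name `TextExtractor`, the `SKIPPED_TAGS` constant, and the `html_to_text` helper are all hypothetical names chosen for illustration. The idea is to track whether the parser is currently inside a skipped element and to ignore text nodes while it is.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside skipped elements."""

    # Hypothetical constant: elements whose contents should never be shown.
    SKIPPED_TAGS = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the collected text nodes and collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p { color: red; }</style></head>"
    "<body><p>Hello <b>world</b></p></body></html>"
)
print(html_to_text(sample))  # prints "Hello world"
```

A real converter would also need to handle block-level spacing, links, and so on; the sketch only shows the skip-while-inside-the-element idea.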
Secondly, I'd like to create a second HTML formatter that, rather than converting to plaintext, only filters out a list of specific "undesirable" tags (e.g., img, script, object, head) and then leaves the rest as HTML. Any pointers on what to modify to do this?
(It looks like there are some constants at the top that define which element types are filtered or allowed. I'm guessing I'd need a method to change which element types are allowed depending on the level of filtering I want? Perhaps deeper changes?)
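As an illustration of the second request, the sketch below passes HTML through unchanged except for a configurable set of blocked tags, which are dropped together with their contents. It is written in Python against the standard library's `html.parser` and does not reflect this project's internals: `TagFilter`, `VOID_TAGS`, and `filter_html` are hypothetical names. Passing the blocked set as a parameter, rather than hard-coding it in a constant, is one way to support different filtering levels.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so dropping one must not
# start a "skip everything until the end tag" region.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class TagFilter(HTMLParser):
    """Re-emits HTML, omitting blocked elements and their contents."""

    def __init__(self, blocked):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.blocked = set(blocked)
        self._depth = 0  # > 0 while inside a blocked element
        self._out = []

    def _emit(self, text):
        if self._depth == 0:
            self._out.append(text)

    def handle_starttag(self, tag, attrs):
        if tag in self.blocked:
            if tag not in VOID_TAGS:
                self._depth += 1
            return
        # Reuse the original tag text so attributes are preserved as-is.
        self._emit(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form, e.g. <br/>; never opens a skip region.
        if tag not in self.blocked:
            self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.blocked:
            if tag not in VOID_TAGS and self._depth > 0:
                self._depth -= 1
            return
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    # Comments and doctype declarations are not re-emitted in this sketch.

    def result(self):
        return "".join(self._out)


def filter_html(html, blocked):
    parser = TagFilter(blocked)
    parser.feed(html)
    parser.close()
    return parser.result()


sample = (
    '<div><script>alert(1)</script><p>Hi &amp; bye<img src="x.png"></p>'
    "<object><p>gone</p></object></div>"
)
print(filter_html(sample, {"script", "img", "object"}))
# prints <div><p>Hi &amp; bye</p></div>
```

This assumes blocked tags are properly closed; an unclosed blocked tag would cause everything after it to be dropped. It is meant as a formatting filter, not as a security sanitizer.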
