From: Paul L. <pa...@sq...> - 2020-08-07 03:07:52
On Thu, July 23, 2020 8:14 pm, Alexey Shpakovsky wrote:
> On Thu, July 23, 2020 18:16, Paul Lesniewski wrote:
>>
>> I'm not sure, but SquirrelMail does not have code that fully converts a
>> HTML document into plain text. In many cases the HTML document can be
>> quite complex and the resulting plaintext may be somewhat unreliable.
>> To be clear, the default is for SM to never show tracking pixels or other
>> unsafe, remote-loaded components in the HTML.
>
> I believe SquirrelMail does have such code - in functions/mime.php, lines
> 399-410 in SquirrelMail version 1.5.2 - yes, it's just 11 lines [1]. From

That code does not convert HTML to plain text. It sanitizes HTML.

> my experience this code runs when I have option "Show HTML Version by
> Default" disabled (so I prefer plaintext), and receive an HTML-only
> message (so a message doesn't have a plaintext part).

Because in that case SquirrelMail has nothing else to work with so it does
the same thing it would do if you got an email with both parts and chose
to view the HTML part.

> In patch ticket number 496 [2] you can find screenshot how this
> HTML-to-text conversion looks like together with my attempt to improve it.
> Note that message shown on screenshot was specifically chosen to look
> especially bad before my changes and especially good with them.

> [2] https://sourceforge.net/p/squirrelmail/patches/496/

It looks bad because you're trying to do something SquirrelMail wasn't
designed to do. I also get annoyed with companies that can't be bothered
to send a plain text part in their email messages, but I'm fairly certain
we don't want to suddenly add a HTML-to-plaintext conversion based on just
a few naive regular expressions that could create security issues. The
code would have to be in a branch based on a user preference for forcing
HTML parts into plaintext and the conversion would be best done with an
already proven library (see html2text).
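
For illustration only, here is a rough sketch of the shape suggested in the last paragraph: a branch on a user preference, with the conversion handed to a real HTML parser rather than a handful of regular expressions. It is written in Python, not SquirrelMail's PHP, and it is not SquirrelMail code. The preference name `force_html_to_plaintext` and the helpers `TextExtractor`, `html_to_plaintext` and `render_html_part` are all hypothetical. Python's standard `html.parser` stands in here for a proven library such as html2text, which would handle far more cases (links, lists, tables) than this does.

```python
from html.parser import HTMLParser

# Hypothetical sketch; none of these names exist in SquirrelMail.
SKIPPED_TAGS = {"script", "style", "head"}  # content that must never reach the reader
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects visible text, using a parser instead of regular expressions."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; and friends
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        # Text inside <script>, <style> or <head> is dropped entirely.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Normalise whitespace within each line and drop empty lines.
        lines = (" ".join(line.split()) for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_plaintext(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def render_html_part(html, prefs):
    """Branch on an (assumed) user preference, as proposed in the message."""
    if prefs.get("force_html_to_plaintext", False):
        return ("text/plain", html_to_plaintext(html))
    # Default path: the HTML part is displayed as HTML (sanitizing not shown).
    return ("text/html", html)
```

With the preference off, nothing changes for the user; with it on, an HTML-only message is reduced to its visible text and script or style content is discarded rather than shown.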