Alessandro Vesely writes:
> On 24/Nov/10 12:56, Sam Varshavchik wrote:
>> Alessandro Vesely writes:
>>> On 24/Nov/10 00:19, Sam Varshavchik wrote:
>>>> sqwebmail does some aggressive sanitizing in order to block potential
>>>> XSS attempts. I don't have the time to keep track of exactly how various
>>>> browsers handle malformed html. If something looks wrong, it gets
>>> I suspect there may be a bug with that nuking, because I saw the '<'s
>>> and '>'s as bare chars rather than &-entities, in the source viewer.
>> That's part of it. When an entire tag is deemed to be suspect,
>> sqwebmaild removes its entire contents, but the < > are left alone.
> Aha, so it isn't a bug. But what is the purpose of leaving those
> relics? I mean, if the sanitizing code could also remove angle
> brackets for the same price, why doesn't it do so?
It's easier to do it that way, given how the internal logic works.
Overall, the message is formatted for display "on the fly". The entire
message is not loaded into memory, but formatted as a stream. So, sqwebmail
does not make excessive RAM demands, no matter how big the message is. It's
the browser's problem to figure out how to format a huge HTML document.
An HTML tag gets extracted. The contents of the tag, but not the angle
brackets around it, are buffered and passed to a sanitizing function that
parses them and decides whether to do anything about them. When the
sanitization function returns, the contents of the buffer are output.
If the sanitization function decides to drop the tag, it was more convenient
to simply overwrite the buffer with spaces. But the surrounding brackets are
still printed by the function that invokes the sanitization function.
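
To make the flow concrete, here is a rough sketch in Python. It is not
sqwebmail's actual code: the names ALLOWED_TAGS, sanitize_tag and
format_stream are made up for illustration, and the allow-list check is only
a stand-in for whatever the real sanitizer decides is suspect. It just shows
why the brackets survive: the caller prints them, and the sanitizer only ever
sees (and blanks) the buffer between them.

```python
import io

# Hypothetical allow-list; the real sanitizing rules are not shown here.
ALLOWED_TAGS = {"a", "b", "br", "i", "p"}


def sanitize_tag(buf: bytearray) -> None:
    """Inspect the tag contents in place; blank them if the tag looks suspect."""
    text = buf.decode("ascii", errors="replace")
    words = text.split()
    name = words[0].strip("/").lower() if words else ""
    suspect = name not in ALLOWED_TAGS or "javascript:" in text.lower()
    if suspect:
        # Dropping the tag: overwrite the buffer with spaces, same length.
        buf[:] = b" " * len(buf)


def format_stream(src, dst, chunk_size=4096):
    """Copy src to dst chunk by chunk, sanitizing each tag as it completes."""
    tag = None  # None while outside a tag, else the buffered tag contents
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        for byte in chunk:
            ch = bytes([byte])
            if tag is None:
                if ch == b"<":
                    tag = bytearray()  # start buffering; '<' itself is not stored
                else:
                    dst.write(ch)
            elif ch == b">":
                sanitize_tag(tag)
                # The caller prints the brackets no matter what the
                # sanitizer did to the buffer.
                dst.write(b"<" + bytes(tag) + b">")
                tag = None
            else:
                tag.extend(ch)
    # Simplification: a tag left unterminated at end of input is discarded,
    # and no cap is placed on the length of a single buffered tag.
```

Feeding it `<b>hi</b><script>x</script>` yields `<b>hi</b><      >x<       >`:
the script tags are blanked, but their angle brackets are still there, which
is what shows up in the source viewer.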