From: Steven J. S. <sj...@Ju...> - 2002-07-09 02:08:07

https://yourwebmail.com/ is an alpha version of a high-powered webmail client I'm developing. You can't compose mail using it yet, but you can definitely read mail, both plaintext and HTML.

Right now, the only thing I do to minimize the potential for problems is run a regexp search/replace on the HTML before rendering it, turning <SCRIPT> into <YWMALIAS> and </SCRIPT> into </YWMALIAS>.

I'm still not sure exactly which direction this project is taking, but perhaps we can discuss input filtering for the webmail app?

I'd be happy to give y'all shell logins and e-mail boxes here so you can check out the code, if interested. Platform is Linux/Apache/Cyrus (POP3/IMAP4 daemon and mailbox store)/Exim (MTA)/MySQL/PHP. LDAP will eventually be used for online address book storage and user authentication.

--
Steve Sobol, CTO JustThe.net LLC, Mentor On The Lake, OH 888.480.4NET
- I do my best work with one of my cockatiels sitting on each shoulder -
6/4/02: A USA TODAY poll found that 80% of Catholics advocated a zero-tolerance stance towards abusive priests. The fact that 20% didn't, scares me...
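
A minimal sketch of the stopgap described above, written in Python rather than the site's PHP. The pattern and the function name rename_script_tags are assumptions for illustration; the actual expression used on the site is not shown in this thread.

```python
import re

# Hypothetical re-creation of the stopgap: rename opening and closing
# script tags to the inert <YWMALIAS> element, ignoring letter case.
SCRIPT_TAG = re.compile(r"<(/?)script\b", re.IGNORECASE)


def rename_script_tags(html: str) -> str:
    """Swap the tag name of every <script ...> and </script> tag."""
    return SCRIPT_TAG.sub(r"<\1YWMALIAS", html)


if __name__ == "__main__":
    # The literal script element is neutralized.
    print(rename_script_tags("<SCRIPT>alert(1)</SCRIPT>"))
    # An event-handler attribute is not a script tag, so it passes through.
    print(rename_script_tags('<b onmouseover="alert(1)">hi</b>'))
```

The second call returns its input unchanged, which is why a rename of this kind can only serve as a stopgap.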
From: Gabriel L. <ga...@bu...> - 2002-07-09 03:50:56

Steve,

You're going to find that there is a whole lot more that is evil than just script tags... What I'd suggest you do instead is parse for occurrences of <> and only allow tags that are on a known-good list, i.e., let i, b, p, br appear, but don't let img, script... You may want to go further and define some semantics for main tags (p, b, i, br) vs. things like onLoad and such that are JavaScript... You can then also allow only certain sub tags to appear as well.

-gabe
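
The allow-list idea can be sketched with Python's standard html.parser module. This is only an illustration under stated assumptions: the tag list is the example from the message, every attribute is dropped (the strictest possible "sub tag" policy), and the names AllowListFilter and filter_html are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"i", "b", "p", "br"}   # example allow-list from the message
DROP_CONTENT = {"script", "style"}     # assumption: discard their bodies too


class AllowListFilter(HTMLParser):
    """Re-emit only allow-listed tags, without any attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and not self.skip_depth:
            self.out.append(f"<{tag}>")  # attributes such as onLoad are dropped

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in ALLOWED_TAGS and tag != "br" and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))  # text is re-escaped on output


def filter_html(html: str) -> str:
    parser = AllowListFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, filter_html('<p onclick="x()">Hi <b>there</b><script>alert(1)</script><img src=x></p>') keeps the p and b elements and drops the handler, the script and the image. How Python's parser treats malformed markup will not necessarily match what any particular browser does with it.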
From: Steven J. S. <sj...@Ju...> - 2002-07-09 12:41:08

On 8 Jul 2002, Gabriel Lawrence wrote:

> Steve,
>
> You're going to find that there is a whole lot more that is evil than
> just script tags... What I'd suggest you do is instead parse for
> occurrences of <> and only allow things to appear in tags that you have
> a good list...

Right. That's what I'm planning on doing. :) I have to figure out the easiest way to do it using PHP and regular expressions.

What I have done so far is just a stopgap for a few days until I can continue working on the site.
From: Gabriel L. <ga...@bu...> - 2002-07-09 16:03:48

When I did a similar thing for a previous project, we benchmarked a specialized parser of our own (one that finds <> and manages what can be in a tag) against regular expressions, and found the non-regular-expression version dramatically faster. This was in Java code, so it could have been that the regular expression library we were using was not the best, but that may be something to consider also.

As a side note, I think this kind of functionality would be something great to put into the filters project....

-gabe
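
A hand-written scanner of the kind described above might look roughly like this. It is a sketch in Python, not the Java code from that project, and the name scan_tags is hypothetical. No claim is made here about its speed relative to regular expressions; that would need to be measured for the language and library in use.

```python
def scan_tags(text):
    """Yield (start, end, tag_name) for each <...> span, using plain
    string searches instead of regular expressions.

    Known limitation: a '>' inside a quoted attribute value ends the
    tag early, so a real filter needs a small state machine for quotes.
    """
    pos = 0
    while True:
        start = text.find("<", pos)
        if start == -1:
            return
        end = text.find(">", start + 1)
        if end == -1:
            return  # unterminated tag: the caller decides how to treat the rest
        # Drop a leading '/' (closing tag) and whitespace, then take the
        # first token as the tag name.
        body = text[start + 1:end].lstrip("/ \t\r\n")
        parts = body.split(None, 1)
        name = parts[0].rstrip("/").lower() if parts else ""
        yield start, end + 1, name
        pos = end + 1
```

Each yielded name can then be checked against the allow-list, and the slice text[start:end] either copied through or discarded.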
From: vertigo <ve...@pa...> - 2002-07-09 16:58:02

Yes, but it is also one of the more complicated regions (a dark, shadowy corner) of the project. Regular expressions are not, as mentioned, the best way to parse HTML on a large scale. It can get way out of control. An actual parser is better for a number of reasons.

The one issue I have is the magnitude of writing an HTML parser. It isn't simple, especially when considering poorly written HTML. Now, from a programmer's perspective I think "dammit, why don't people write correct HTML?" From a customer's perspective, however, I think "We decided to write this site to be used only with Internet Explorer. 75% of the people out there use IE, most of the remaining 25% have IE available on their computer, and the rest we don't care about. The browser wars are over and Microsoft won. IE renders this code fine. Why can't your Filter handle it? I'm sure as hell not going to pay several thousand dollars to have that idiot coder come back and rewrite everything."

Remember, we have to catch everything that Explorer THINKS is valid HTML. Explorer thinks the following is valid code:

    <html head>
    <title> microsoft has a very robust parser.</title>
    </head>
    <script>
    function f() {
    var x = 10
    alert("I can't believe IE handles this." + x)
    }
    </script
    <body>
    IE is great when it comes to parsing HTML, much to the chagrin of many programmers.
    <br>
    <br>
    <input type="mutton" value="amazing" onClick="f()">
    </body
    </html>

Put into the project perspective, we have to write HTML parsers for each implementation, and this can be much more complicated than it first appears. We might not want to ship limited support in the first release and then improve it later. Cross-site scripting is a huge issue, and deserves to be handled in great detail.

nathan
From: Alex R. <al...@se...> - 2002-07-09 17:13:33

vertigo wrote:

> Yes, but it is also one of the more complicated regions (a dark, shadowy
> corner) of the project. Regular expressions are not, as mentioned, the
> best way to parse HTML on a large scale. It can get way out of control.
> An actual parser is better for a number of reasons.

I'm not sure that I follow this rationale, given your understanding of the significant costs and marginal benefits this provides (as demonstrated below).

> Put into the project perspective, we have to write HTML parsers for each
> implementation, and this can be much more complicated than it first
> appears. We might not want to have limited support in the first release,
> and then improve it later.

What for? Why would we _ever_ need to do such a thing? You trust some input, you don't trust some other input. If something is tainted, then strip out all semblance of <script> tags. We don't have to handle badly nested tag sets, etc... we just have to canonicalize the data then clobber the beginning tag, end of story.

> Cross-site scripting is a huge issue, and
> deserves to be handled in great detail.

Agreed, I'm just not quite so sure it's as hard a problem as you're making it out to be.

--
Alex Russell al...@Se... al...@ne...
From: Gabriel L. <ga...@bu...> - 2002-07-09 19:07:48

On Tue, 2002-07-09 at 10:13, Alex Russell wrote:

> What for? Why would we _ever_ need to do such a thing? You trust some
> input, you don't trust some other input. If something is tainted, then
> strip out all semblance of <script> tags. We don't have to handle badly
> nested tag sets, etc... we just have to canonicalize the data then
> clobber the beginning tag, end of story.

Problem is there are a bunch of places besides script tags that can hold scripts.... So to do this right you really do need to be able to parse things. Now, I agree with Alex that we don't have to be as friendly as Nathan suggests. That example HTML is really totally busted... I think Alex is right that we should try to canonicalize it into what we think is good HTML (and take a conservative approach) so that we can protect against different kinds of attacks.

> > Cross-site scripting is a huge issue, and
> > deserves to be handled in great detail.
>
> agreed, I'm just not quite so sure it's as hard a problem as you're
> making it out to be.

Well, yes and no. It really is a hard problem, because of all the strange places scripting can show up.

-=gabe
From: Alex R. <al...@ne...> - 2002-07-09 20:04:56

Gabriel Lawrence wrote:

> Problem is there are a bunch of places besides script tags that can hold
> scripts....

But there aren't that many:

    <script> tags
    javascript: pseudo protocol
    <object> tags (and whatever the NN equiv was)
    on* event handlers

Am I missing any? If you block recognition by the browser of these execution contexts, then you've won. You don't have to do any strange HTML parsing, just straightforward regexp work.

> So to do this right you really do need to be able to parse
> things. Now, I agree with Alex, that we don't have to be as friendly as
> Nathan suggests. That example HTML is really totally busted... I think
> Alex is right, that we should try and canonicalize it into what we think
> is good html (and take a conservative approach) so that we can protect
> from different kinds of attacks.

I don't think we even need to worry about tag closure. If the browser can't detect the beginning of a script block, then we shouldn't worry about it. Simply make sure things are in the right charset.

>>> Cross-site scripting is a huge issue, and
>>> deserves to be handled in great detail.
>>
>> agreed, I'm just not quite so sure it's as hard a problem as you're
>> making it out to be.
>
> well, yes and no. It really is a hard problem, because of all the
> strange places scripting can show up.

I don't think it's that many, nor are they strange. Or am I missing something?
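
The four execution contexts listed above can be sketched as regular expressions. The patterns and the name find_script_contexts are assumptions for illustration, and they only detect; they do not rewrite. They also assume the input has already been canonicalized (entities and character set decoded), as mentioned earlier in the thread.

```python
import re

# One hypothetical pattern per execution context named in the message.
CONTEXT_PATTERNS = {
    "script tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "javascript: URL": re.compile(r"javascript\s*:", re.IGNORECASE),
    "object tag": re.compile(r"<\s*object\b", re.IGNORECASE),
    "event handler": re.compile(r"\bon\w+\s*=", re.IGNORECASE),
}


def find_script_contexts(html: str) -> list:
    """Return the names of the contexts whose pattern occurs in html."""
    return [name for name, pattern in CONTEXT_PATTERNS.items()
            if pattern.search(html)]
```

Run on '<a href="JavaScript:alert(1)" onClick="x()">go</a>', this reports the javascript: URL and the event handler. The event-handler pattern is deliberately loose and will also flag harmless body text such as "only=1", so a filter built this way trades false positives for simplicity.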
From: Alex R. <al...@se...> - 2002-07-09 22:00:19

Gabriel Lawrence wrote:

>>> Problem is there are a bunch of places besides script tags that can hold
>>> scripts....
>>
>> But there aren't that many:
>>     <script> tags
>>     javascript: pseudo protocol
>>     <object> tags (and whatever the NN equiv was)
>>     on* event handlers
>
> Don't forget the stuff with CSS. This is where it gets tricky...

CSS (by which I assume you mean Cascading Style Sheets) in no way affects JavaScript, nor does it create active/scriptable content. It is formatting for document structure. Can it be used maliciously? Perhaps, but it is also trivial to filter. It is not nearly as dangerous as JavaScript/ActiveX, nor does it present any threat that "regular" HTML content does not.

>> I don't think it's that many, nor are they strange. Or am I missing
>> something?
>
> It's one of those things that as you dig in deeper and deeper it gets
> troublesome. I'm not trying to discourage folks, but just saying we've got
> to do a fair amount of research and thinking before you just jump to
> regexes. The issue is there are a bunch of overlapping standards. Good
> thing is that there are standards and rules, so the idea is just to make
> sure you account for all the standards that apply and the
> overzealousness of the browsers.

If you can think of something valid in addition to the things I listed, cool, but I really think that there's not a lot to it. Cutting XSS off at the knees isn't hard, just tedious.

> A quote from one of the leads at Netscape: "There is no such thing as bad
> html"

I'm not looking to force valid markup on the world, just to protect them from malicious uses of said markup. That's a simpler task.
From: Gabriel L. <ga...@bu...> - 2002-07-09 22:12:11

On Tue, 2002-07-09 at 15:00, Alex Russell wrote:

> CSS (by which I assume you mean Cascading Style Sheets) in no way
> affects JavaScript, nor does it create active/scriptable content. It is
> formatting for document structure. Can it be used maliciously? Perhaps,
> but it is also trivial to filter. It is not nearly as dangerous as
> JavaScript/ActiveX, nor does it present any threat that "regular" html
> content does not.

I'll try and dig it up, but I think there is a way to inject script using cascading style sheets. One of the more recent Hotmail problems, if I recall correctly...

-gabe
From: Gabriel L. <ga...@bu...> - 2002-07-09 22:29:13

Unfortunately I couldn't find exact details... But here is a general description:

http://groups.google.com/groups?q=guninski+hotmail+style+sheets&hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&selm=37DEBE41.341844AE%40yahoo.com&rnum=1

Note: iframe is also a tag that seems to cause all kinds of trouble for XSS vulnerabilities...

-=gabe
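
As a rough illustration of the style-sheet and iframe concern, a filter could flag style text containing constructs that some browsers treat as active content. The two constructs checked here (IE's expression() dynamic properties and javascript: URLs inside url(...)) are assumptions chosen for the sketch; the exact technique in the linked report is not given in this thread. The names style_looks_active and has_iframe are hypothetical.

```python
import re

# Assumed examples of script-capable CSS constructs.
ACTIVE_CSS = re.compile(r"expression\s*\(|javascript\s*:", re.IGNORECASE)
IFRAME_TAG = re.compile(r"<\s*iframe\b", re.IGNORECASE)


def style_looks_active(css: str) -> bool:
    """True if a style attribute or <style> body contains a flagged construct."""
    return bool(ACTIVE_CSS.search(css))


def has_iframe(html: str) -> bool:
    """True if the markup opens an iframe element."""
    return bool(IFRAME_TAG.search(html))
```

A conservative filter would drop the whole style attribute or style block whenever style_looks_active returns True, rather than try to repair it.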