Re: [Owasp-input-api-developers] I have an immediate application for the filters!

vertigo wrote:
> Yes, but it is also one of the more complicated regions (a dark, shadowy
> corner) of the project.  Regular expressions are not, as mentioned, the
> best way to parse HTML on a large scale.  It can get way out of control.
> An actual parser is better for a number of reasons.

I'm not sure I follow this rationale, given your own understanding of 
the significant costs and marginal benefits this provides (as 
demonstrated below).

> Put into the project perspective, we have to write HTML parsers for each
> implementation, and this can be much more complicated than it first
> appears.  We might not want to have limited support in the first release,
> and then improve it later.

What for? Why would we _ever_ need to do such a thing? You trust some 
input, you don't trust some other input. If something is tainted, then 
strip out all semblance of <script> tags. We don't have to handle badly 
nested tag sets, etc... we just have to canonicalize the data, then 
clobber the beginning tag, end of story.
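
To make "canonicalize, then clobber" concrete, here is a minimal sketch. 
It assumes Python, and the names canonicalize() and clobber_script_tags() 
are hypothetical, not part of any existing filter API. It decodes URL 
escapes and HTML entities until the string stops changing, then escapes 
the "<" of every opening script tag. It only illustrates the idea above 
and makes no claim to cover every way of injecting script.

```python
import html
import re
from urllib.parse import unquote

# Matches the start of an opening script tag, e.g. "<script" or "< SCRIPT".
# Closing tags ("</script") are deliberately left alone.
_SCRIPT_OPEN = re.compile(r"<(\s*script\b)", re.IGNORECASE)


def canonicalize(data: str, max_rounds: int = 5) -> str:
    """Decode URL escapes and HTML entities until the input is stable.

    max_rounds is an assumed safety limit against deeply nested encodings.
    """
    for _ in range(max_rounds):
        decoded = html.unescape(unquote(data))
        if decoded == data:
            break
        data = decoded
    return data


def clobber_script_tags(tainted: str) -> str:
    """Canonicalize tainted input, then escape each opening script tag."""
    return _SCRIPT_OPEN.sub(r"&lt;\1", canonicalize(tainted))


if __name__ == "__main__":
    print(clobber_script_tags("%3Cscript%3Ealert(1)%3C/script%3E"))
```

Note that canonicalizing first also decodes escapes in harmless input, so 
this would only be applied to data already marked as tainted.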

> Cross-site scripting is a huge issue, and
> deserves to be handled in great detail.

Agreed. I'm just not so sure it's as hard a problem as you're 
making it out to be.

-- 
Alex Russell
al...@Se...
al...@ne...