Re: [Pas-dev] perldoc perlre
From: Kyle R . B. <mo...@vo...> - 2002-05-20 18:19:28
> > foreach ... {
> > $param =~ s/<\s*SCRIPT\b/&lt;SCRIPT/g;
> > $param =~ s/<\s*CODE\b/&lt;CODE/g;
> > $param =~ s/<\s*APPLET\b/&lt;APPLET/g;
> > }
> >
> > and just forget the trailing '>'? Or is that necessary? How to escape
> > it without potentially manipulating data we're not supposed to is a
> > delicate issue.
>
> not getting the trailing '>' may lead to a problem with the display of the page.
> also don't forget to get the '</SCRIPT>', etc. out of the code.
>
> hmm.. not translating all '>' or '<' can lead to some interesting pages. i
> could start posting in '</td>' or '</input>' or '</form>'. they're not vicious
> but, going back to the discussion board example, you could feasibly break a
> thread in a discussion board by passing some simple HTML.
Yes, that's right. For discussion boards, we could have a re-encode function
that re-enables a lot of the 'safe' stuff. To go backwards for 'safe' tags,
we could just have:
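foreach ... {
    foreach my $tag ( qw( a img p br pre code font ) ) {
        # \b stops 'a' from also re-enabling <applet>, 'p' from matching <param>, etc.
        # .*? (not .+?) so that bare tags such as <p> and <br> are restored too.
        $param =~ s#&lt;($tag\b.*?)&gt;#<$1>#gi;
        $param =~ s#&lt;(/$tag\s*)&gt;#<$1>#gi;
    }
    ...
}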
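To make the round trip concrete, here is a rough, self-contained sketch of the scrub-then-un-scrub idea. The names scrub() and unscrub_safe_tags() are made up for illustration and are not existing PAS functions, and the blanket scrubber is assumed to do nothing more than turn '<' and '>' into entities. The \b after the tag name matters: without it, the 'a' entry would also match the start of 'applet'.

```perl
use strict;
use warnings;

# Hypothetical blanket scrubber: escape every angle bracket in a parameter.
sub scrub {
    my ($param) = @_;
    $param =~ s/</&lt;/g;
    $param =~ s/>/&gt;/g;
    return $param;
}

# Hypothetical un-scrubber: turn the escaped form of the listed tags back
# into real markup. Anything not on the list stays escaped.
sub unscrub_safe_tags {
    my ( $param, @tags ) = @_;
    foreach my $tag (@tags) {
        # Opening tag, with or without attributes. \b keeps 'a' from
        # matching 'applet' and 'p' from matching 'param'.
        $param =~ s#&lt;(\Q$tag\E\b.*?)&gt;#<$1>#gi;
        # Closing tag, allowing stray whitespace before the '>'.
        $param =~ s#&lt;(/\Q$tag\E\s*)&gt;#<$1>#gi;
    }
    return $param;
}

my @safe  = qw( a img p br pre code font );
my $input = '<p>Hi <a href="/x">link</a></p>'
          . '<applet code="Evil"></applet><script>alert(1)</script>';

my $output = unscrub_safe_tags( scrub($input), @safe );
print "$output\n";
```

The p and a tags come back as real markup while the applet and script tags stay escaped. Note that attributes are restored verbatim, so something like an onclick on an allowed tag would pass straight through; this only shows the tag-name filtering.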
Gak, you always hear people say that you can't swat HTML with regexes...and
they say that for good reason. The proper approach is to use a full-blown
HTML parser...but that's really too heavyweight for this application...
Discussion boards might want to turn off the blanket scrubber and use
their own, _or_ un-scrub the stuff they want to allow...I don't know...if
we write one that scrubs appropriately for the discussion board example,
maybe we should just make that the default...
k
--
------------------------------------------------------------------------------
Wisdom and Compassion are inseparable.
-- Christmas Humphreys
mo...@vo... http://www.voicenet.com/~mortis
------------------------------------------------------------------------------