From: Lee <le...@gm...> - 2009-07-27 12:46:28
On 7/26/09, Gilles <cod...@fr...> wrote:
> Hello
>
> It took me a while to figure out that the following regex is greedy:
>
> #GREEDY
> s|(<font color=\#[^>].+?>JohnDoe</font>)|<span class=myclass>$1</span>|gs
>
> ie. "[^>].+?>" doesn't tell Privoxy to skip all characters until it
> finds the first occurrence of ">"; Instead, it kept going forth until it
>
> As a work-around, I used this more restrictive regex:
>
> #GOOD
> s|(<font color=\#.{6}>JohnDoe</font>)|<span class=myclass>$1</span>|gs
>
> Still, I'm curious to understand why the former doesn't work as
> expected (ie. "[^>].+?>" doesn't succeed in having Privoxy skip all
> characters until it finds the first occurrence of ">").
>
> Thank you for any hint.

hrmm... after a quick look thru the privoxy documentation I couldn't
find it either - but take a look at the start of default.filter:

# Note2: In addition to the Perl options gimsx, the following nonstandard
# options are supported:
#
# 'U' turns the default to ungreedy matching.  Add ? to quantifiers to
#     switch back to greedy.

so change the options from "gs" to "gsU" to get ungreedy matching.  eg:

s@<iframe\s+[^>]* title="Advertisement" [^>]*>.*</iframe>@\
<!-- Privoxy removed Advertisement iframe -->@gisU

Regards,
Lee
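
P.S. For anyone who wants to poke at the behaviour outside Privoxy, here is a rough sketch using Python's re module. This is an assumption-laden stand-in, not Privoxy itself: Python has no 'U' option, and the sample HTML string is made up for illustration. It only shows that a lazy ".+?" under the s option is still allowed to step over ">" characters when that is the only way for the rest of the pattern to match, while a negated class such as "[^>]+" cannot. Note also that, going by the default.filter comment quoted above, adding 'U' would flip the meaning of an existing "?" on a quantifier, so a pattern containing ".+?" would probably need that "?" dropped as well.

```python
import re

# Hypothetical sample input: two <font> tags, only the second one wraps JohnDoe.
html = '<font color=#112233>Alice</font> <font color=#445566>JohnDoe</font>'

# Pattern from the question (re.S corresponds to the "s" option).
# The lazy ".+?" starts short but keeps growing, across ">" if needed,
# until ">JohnDoe</font>" can match right after it.
lazy_dot = re.compile(r'(<font color=#[^>].+?>JohnDoe</font>)', re.S)

# Alternative: "[^>]+" can never consume a ">", so the match has to
# start at the tag that immediately precedes JohnDoe.
negated = re.compile(r'(<font color=#[^>]+>JohnDoe</font>)', re.S)

lazy_match = lazy_dot.search(html).group(1)
negated_match = negated.search(html).group(1)

print(lazy_match)     # starts at the first <font ...> tag
print(negated_match)  # only the tag around JohnDoe
```

The lazy version still begins matching at the first "<font color=#", because the regex engine tries the leftmost starting position first and laziness only affects how far each quantifier extends from there.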