#16 Incorrect regular expressions used

closed
Rogan Dawes
5
2006-01-13
2005-07-21
No

The Proxy plugin and the Manual Request plugin use what
I believe to be an incorrect regular expression:

.*\.(gif)|(jpg)|(css)|(js)$

This expression does not always match graphic files and
I've often found myself accepting manual requests for
these files when setting up the proxy in interception
mode.
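
A minimal sketch of why the quoted expression misbehaves. The alternation operator binds more loosely than concatenation, so the pattern splits into four independent alternatives rather than one list of extensions. How the plugins apply the pattern (whole-string match or substring search) is not stated in this report, so both are shown as assumptions. Python is used only for illustration, and the helper names and example URLs are hypothetical.

```python
import re

# The pattern quoted in the report. Because | has the lowest precedence,
# it parses as four separate alternatives:
#   .*\.(gif)   |   (jpg)   |   (css)   |   (js)$
ORIGINAL = r".*\.(gif)|(jpg)|(css)|(js)$"


def matches_whole(pattern: str, url: str) -> bool:
    """True if the pattern matches the entire URL (assumed full-match semantics)."""
    return re.fullmatch(pattern, url) is not None


def matches_anywhere(pattern: str, url: str) -> bool:
    """True if the pattern matches some substring of the URL (search semantics)."""
    return re.search(pattern, url) is not None


if __name__ == "__main__":
    # Hypothetical URLs, chosen only to exercise each alternative.
    for url in (
        "http://example.com/logo.gif",
        "http://example.com/photo.jpg",
        "http://example.com/jpg-upload.php",
    ):
        print(url, matches_whole(ORIGINAL, url), matches_anywhere(ORIGINAL, url))
```

Under whole-string matching only URLs ending in .gif are caught, which is consistent with the behaviour described above. Under substring search the pattern instead matches any URL that merely contains "jpg" or "css".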

I suggest that the following regular expression should
be used instead:

.*\.(gif|jpg|png|css|js|ico|swf)$

It is a valid regular expression (it matches the graphic
files above) and also covers some file types that are
common on web sites but that it might not make sense to
intercept in the proxy:

- .ico files: fetched by most browsers (favicon.ico) to
show an icon for the web server visited

- .swf files: Flash plugin files, since many sites use
them for either ads or some dynamic content (I use the
flashblock plugin in Mozilla to prevent them from loading
automatically, however)
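
The suggested expression can be checked with a short sketch like the one below. It assumes the pattern is applied to the whole URL. The function name `is_static_resource` and the example URLs are hypothetical, and Python is used only for illustration.

```python
import re

# The expression proposed in this report: one group holding every extension,
# anchored to the end of the string.
PROPOSED = r".*\.(gif|jpg|png|css|js|ico|swf)$"


def is_static_resource(url: str) -> bool:
    """True if the whole URL ends in one of the listed extensions."""
    return re.fullmatch(PROPOSED, url) is not None


if __name__ == "__main__":
    # Hypothetical URLs: the first three should be filtered, the last two not.
    for url in (
        "http://example.com/photo.jpg",
        "http://example.com/favicon.ico",
        "http://example.com/banner.swf",
        "http://example.com/index.html",
        "http://example.com/jpg-upload.php",
    ):
        print(url, is_static_resource(url))
```

Grouping the extensions inside a single pair of parentheses makes the leading `.*\.` and the trailing `$` apply to every alternative, not only to the first and last one.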

Javier

Discussion

  • Logged In: YES
    user_id=404932

    I forgot to mention that the new filter also adds png as a
    new image file format. Maybe "jpeg" could be added as an
    extension there too, although it's not that common these days.

    Based on Google searches on images.google.com:
    - 6,940,000 jpg images
    - 7,160,000 gif images
    - 4,890,000 jpeg images
    - 4,750,000 png images

     
  • Rogan Dawes
    2005-07-21

    Logged In: YES
    user_id=438260

    Hi Javier,

    Thanks for the comment.

    I have had similar suggestions in the past, and had thought
    that I had fixed this already. Of course, since my own
    personal WebScarab.properties file already has something
    similar in it, and that overrides the application default,
    it is easy to miss. Will try to remember to fix this the
    next time I make some changes.

     
  • Rogan Dawes
    2006-01-13

    • status: open --> closed
     
  • Rogan Dawes
    2006-01-13

    Logged In: YES
    user_id=438260

    The default regex has been changed.