File Date Author Commit
 cache 2010-01-19 davidredwaratah [r1] Inital load
 filters 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 logrotate.d 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 profile 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 .project 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 CHANGELOG.txt 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 FILTER.OPTIONS.txt 2010-01-19 davidredwaratah [r1] Inital load
 INSTALL.txt 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 MANIFEST.in 2010-01-19 davidredwaratah [r1] Inital load
 OPTIONS.txt 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 README.txt 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 adfilter.py 2010-01-19 davidredwaratah [r1] Inital load
 benchmark.txt 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 common.py 2010-01-19 davidredwaratah [r1] Inital load
 contentfilter.py 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 domainfilter.py 2010-01-19 davidredwaratah [r1] Inital load
 gentoorcscript.willow 2010-01-19 davidredwaratah [r1] Inital load
 gpl.txt 2010-01-19 davidredwaratah [r1] Inital load
 pamauth.py 2010-01-19 davidredwaratah [r1] Inital load
 pamexample.willow 2010-01-19 davidredwaratah [r1] Inital load
 pamnoauth.willow 2010-01-19 davidredwaratah [r1] Inital load
 redhatrcscript.willow 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 safesearchfilter.py 2010-01-19 davidredwaratah [r1] Inital load
 setup.py 2010-01-19 davidredwaratah [r1] Inital load
 shockwavefilter.py 2010-01-19 davidredwaratah [r1] Inital load
 slackwarercscript.willow 2010-01-19 davidredwaratah [r1] Inital load
 urlfilter.py 2010-01-19 davidredwaratah [r1] Inital load
 willow.conf 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...
 willow.py 2010-04-28 stewart [r2] Formal release: Willow 4.0.1 - see CHANGELOG.t...

Read Me

WILLOW is a caching, content-filtering proxy/HTTP server written in Python.




IMPORTANT NOTE:

Willow is designed to work best with current HTTP/1.1 web servers.
Anything designed since 2006 should fit the idea of "current".

As of April 2010, certain servers still use HTTP/1.0, which poses problems for
Willow. Known examples are Apple's software update server
(http://swscan.apple.com) and the Edna MP3/OGG server, and there are likely
others. Certain data exchanges may not fully work with these legacy servers, and
their operators should be challenged to modernize their systems for increased
efficiency.
For reference, see http://www2.research.att.com/~bala/papers/h0vh1.html

This backwards compatibility issue seems deeply ingrained in Willow's original design, 
and no efforts are currently planned to fully support HTTP/1.0 servers.

As a workaround, configure known problem HTTP/1.0 sites to bypass the proxy. This
can often be done in the client (e.g. the Firefox and Mac OS X proxy settings).
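
The bypass decision described above can be sketched in Python. This is only an
illustration of the idea, not part of Willow and not how any particular client
implements its exception list. The function name should_bypass and the list
BYPASS_HOSTS are assumptions made for this example.

```python
# Hypothetical helper: decide whether a host should skip the proxy.
# Nothing here is part of Willow; names are illustrative only.

def should_bypass(host, bypass_suffixes):
    """Return True if host equals, or is a subdomain of, a listed entry."""
    host = host.lower().rstrip(".")
    for suffix in bypass_suffixes:
        suffix = suffix.lower().lstrip(".")
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


# Example bypass list holding the HTTP/1.0 host named in this README.
BYPASS_HOSTS = ["swscan.apple.com"]

if __name__ == "__main__":
    for name in ("swscan.apple.com", "www.example.org"):
        route = "direct" if should_bypass(name, BYPASS_HOSTS) else "via proxy"
        print(name, "->", route)
```

Matching on whole domain labels (rather than a plain substring) avoids
accidentally bypassing unrelated hosts whose names merely end in the same
characters.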




Willow Filtering Proxy

    "He took some of the seed of your land and put it in fertile soil. He
    planted it like a willow by abundant water, and it sprouted and became a
    low, spreading vine. Its branches turned toward him, but its roots remained
    under it. So it became a vine and produced branches and put out leafy
    boughs."
    
    -- THE Bible, Ezekiel 17:5-6

Willow is a content-filtering proxy server. It bears one similarity to the many
other pieces of software available for web filtering in that it is designed to
filter web content. That, however, is where the similarities end. The
differences between Willow and other solutions are significant, and these
differences make Willow the first really usable internet filter.

    * Expense:

      Willow is available free of charge. This is the complete, full-featured
      version of the software. There are no holdbacks or catches. In addition,
      any improvements or updates will be free of charge.
      
    * Code:

      Willow is open source (under the GNU General Public License). The source
      code is available free of charge to anyone, and anyone is allowed to make
      any modification they wish (as long as they also release the source code).
      There are many reasons that we make our software open source. First, it
      lets users customize the code for their own use. If any of our software
      doesn't do exactly what you want, feel free to change it yourself - we
      don't care. In fact, we would encourage you to do so because this will
      make our software better. We try very hard to make our software the best
      that it can possibly be. However, we know that there are many smart people
      out there, and the more input that we have on our code, the better it is
      going to be. So, if you decide to have a go at our code, let us know. We
      will incorporate your ideas into our software and distribute the
      improvements to everyone that is using it.
      
    * Filtering Algorithm:

      Commercial internet filters sell you a subscription to a list of bad
      sites. They attempt to keep this list up to date, although they don't
      actually let you see the list. With the massive increase in
      websites over the last several years, this model for web filtering has
      become obsolete. With over 2,000,000,000 sites on the web, it has become
      impossible to categorize all these sites and keep this list sufficiently
      up-to-date to accomplish effective filtering. Willow uses a Bayesian
      algorithm to classify pages on the fly based upon previous pages that have
      already been classified. Willow comes with a set of pages that were
      classified for Woodland Hills School District in Pittsburgh, Pennsylvania.
      Sites that use Willow can start with these pages and add their own sets of
      good and bad pages, or they can start from scratch with pages that they
      classify themselves. Willow puts control of the filtering algorithm in the
      hands of its users, not a single corporation.
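
To illustrate the general idea of Bayesian classification from previously
classified pages, here is a minimal naive Bayes sketch in Python. It is not
Willow's actual implementation. The class name, the tokenizer, the smoothing
scheme and the sample training text are all assumptions made for this example.

```python
import math
import re
from collections import Counter


class NaiveBayesSketch:
    """Toy two-class naive Bayes text classifier (illustrative only)."""

    LABELS = ("okay", "bad")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase runs of letters and apostrophes.
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, label):
        """Record one already-classified page under the given label."""
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def classify(self, text):
        """Return the label with the higher log-probability score."""
        if any(self.doc_counts[label] == 0 for label in self.LABELS):
            raise ValueError("train at least one page for each label first")
        vocab = set()
        for label in self.LABELS:
            vocab.update(self.word_counts[label])
        slots = len(vocab) + 1  # one extra slot for unseen words
        total_docs = sum(self.doc_counts.values())
        tokens = self.tokenize(text)
        scores = {}
        for label in self.LABELS:
            score = math.log(self.doc_counts[label] / total_docs)  # prior
            total_words = sum(self.word_counts[label].values())
            for word in tokens:
                # Add-one (Laplace) smoothing keeps unseen words non-zero.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + slots))
            scores[label] = score
        return max(scores, key=scores.get)


if __name__ == "__main__":
    nb = NaiveBayesSketch()
    nb.train("homework math science lesson school", "okay")
    nb.train("library reading science books", "okay")
    nb.train("explicit adult content xxx", "bad")
    nb.train("adult explicit pictures", "bad")
    print(nb.classify("science homework for school"))
```

Adding more classified pages through train() shifts the word statistics, which
is the sense in which a site can tune the filter with its own good and bad
pages.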

In addition to being the first web filter to really work, Willow was also
designed to make life easy on network administrators. To this end Willow
supports the following:

    * HTTPS tunneling 
    * response caching 
    * filtering based on any part of the request or response (domain, url, headers, etc.) 
    * through-the-web management 
    * authentication to a Windows NT/2000 domain 
    * authentication through Unix password files
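
As a rough picture of what filtering on different parts of a request could look
like, the following Python sketch chains a few independent checks. It does not
use Willow's filter interface, which is not described in this README. The
Request class, the block_* helpers and the example hosts are hypothetical.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Request:
    """Hypothetical stand-in for a proxied request (not Willow's class)."""
    url: str
    headers: dict = field(default_factory=dict)

    @property
    def domain(self):
        return (urlsplit(self.url).hostname or "").lower()


def block_domain(blocked):
    """Build a check that matches requests to any listed domain."""
    blocked = {d.lower() for d in blocked}
    return lambda request: request.domain in blocked


def block_url_substring(fragment):
    """Build a check that matches URLs containing the given fragment."""
    return lambda request: fragment in request.url


def block_header(name, value):
    """Build a check that matches an exact header name/value pair."""
    return lambda request: request.headers.get(name) == value


def is_blocked(request, filters):
    """A request is blocked if any single check matches it."""
    return any(check(request) for check in filters)


if __name__ == "__main__":
    filters = [
        block_domain(["ads.example.com"]),
        block_url_substring("/banner/"),
        block_header("X-Example", "1"),
    ]
    print(is_blocked(Request("http://ads.example.com/x"), filters))
    print(is_blocked(Request("http://example.org/index.html"), filters))
```

Keeping each check as a small function means a new criterion can be added
without touching the others.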


WARNING: The Bayesian filtering algorithm allows the filter to determine whether
an item is "okay" or "bad" based upon previous "okay" and "bad" content that it
has seen. So that the filter works "out of the box", the download files include
both "okay" and "bad" content. The "bad" content is pornography. While the
software will not show you the "bad" content, it is possible to browse through
the directory it is in (since it isn't encrypted in any way). If you are
offended by this, or it is illegal for you to view this, please do not download
the files.