NLP WebCrawler
==============
Description
-----------
A WebCrawler for Natural Language Processing. It searches the web for monolingual text in a specified language and for bilingual parallel text. The WebCrawler was adapted from a project by Hatem Mostafa (http://www.codeproject.com/cs/internet/Crawler.asp).
The n-gram files used for language identification are not included in the distribution but are available on request (http://www.ctext.co.za/).
Requires
--------
* IKVM.NET (http://www.ikvm.net/)
* PDFBox - Java PDF Library (http://www.pdfbox.org/)