Usage
Crawlzilla has two management modes:
- Dialog management: offers low-level node management: (1) check cluster status, (2) datanode & tasktracker node management, (3) datanode & tasktracker management, (4) Tomcat management, (5) change the Tomcat port.
- Web interface management: offers (1) crawl setup, (2) search engine management, (3) index pool management.
Crawlzilla usage procedure:

1. First Usage
$ /home/crawler/crawlzilla/system/crawlzilla
1.2 Check cluster status

1.3 Setup cluster
Enables all nodes to run datanode & tasktracker.


1.4 Check datanode & tasktracker status on all nodes
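Outside the dialog, the same daemons can also be spotted by hand on a single node. The following is only a minimal sketch, assuming the JDK's `jps` tool is available and that the Hadoop daemons appear in its output under the names `DataNode` and `TaskTracker`; the `daemon_status` helper is hypothetical and not part of Crawlzilla.

```shell
#!/bin/sh
# Hypothetical helper: report whether a Hadoop daemon appears in jps-style
# output ("<pid> <MainClass>" per line), read from standard input.
daemon_status() {
    name="$1"
    # Match the daemon name against the whole second field of each line.
    if awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
        echo "$name: running"
    else
        echo "$name: not running"
    fi
}

# Example on a node (jps ships with the JDK):
#   jps | daemon_status DataNode
#   jps | daemon_status TaskTracker
```

This only inspects the node it runs on; the dialog entry remains the way to check every node at once.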

2. Crawlzilla Web Usage
The first time you log in to the web interface, you must change the administrator password.
2.2 Crawl setup

Go to the "crawl page" and enter 3 parameters:
- Index Pool name: identifies this search engine and its index pool
- Crawl URLs: the URLs you want to crawl (e.g. https://sourceforge.net/p/crawlzilla/wiki)
- Crawl depth: the depth to crawl from these URLs
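Before filling in the form, it can help to sanity-check the three values. The sketch below is illustrative only: `validate_crawl` is a hypothetical helper, and the rules it applies (name limited to letters, digits, "_" and "-"; URL starting with http:// or https://; depth a positive integer) are assumptions, not constraints documented by Crawlzilla.

```shell
#!/bin/sh
# Hypothetical pre-check for the three crawl form fields; not part of Crawlzilla.
validate_crawl() {
    pool="$1"; url="$2"; depth="$3"

    # Index pool name: assumed to be non-empty letters, digits, "_" or "-".
    case "$pool" in
        ''|*[!A-Za-z0-9_-]*) echo "invalid index pool name"; return 1 ;;
    esac

    # Crawl URL: assumed to start with http:// or https://.
    case "$url" in
        http://?*|https://?*) ;;
        *) echo "invalid URL: $url"; return 1 ;;
    esac

    # Crawl depth: non-empty, digits only, and not 0.
    case "$depth" in
        ''|*[!0-9]*|0) echo "invalid depth"; return 1 ;;
    esac

    echo "ok"
}

# Example:
#   validate_crawl wiki https://sourceforge.net/p/crawlzilla/wiki 3
```

The function checks one URL at a time; for several crawl URLs, call it once per URL with the same pool name and depth.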
2.3 Check crawl status

2.4 Use search engine
2.5 Embed the search engine in another page