///////////////-------------------------------------------------\\\\\\\\\\\\\\\\
<<<<<<<<<<<<<< ConNeCTOR >>>>>>>>>>>>>>>
\\\\\\\\\\\\\\\-------------------------------------------------////////////////
ConNeCTOR - Convenient Network Characteristics Testing Organized Routines.
Together with WIRE-Nic, this software is used to analyze website
characteristics such as IPv6 adoption, geolocation, HTML standards adherence
and NTP synchronization, among others.
This software is used in the project run by Nic.br called TIC WEB
(http://www.ceptro.br/CEPTRO/MenuCEPTROSPCensoWeb), which studies some of
the characteristics of the Brazilian web.
This project was built modularly, so that it is easier to develop more tests
and features. It is written in Java and has been tested and used on Ubuntu from
version 10.04 onward. It uses a MySQL instance as its database.
The docs folder contains an SQL dump of the database structure.
It was created to help us discover some of the characteristics of a web
site. It supports four types of operations:
- Load important data from the WIRE database into the MySQL database;
- Run site and server tests, such as IPv6 adoption and geolocation;
- Run page tests, such as HTML standards adherence;
- Run link tests, to learn more about the objects that are being used
on the web.
In release 1.0, the tests it can perform are:
- Load data from WIRE
    It can load CSV files of pages, sites and languages that can be
    exported from a WIRE execution.
- Site tests
    The site tests are:
    - Domain identification;
    - Response;
    - IPv6;
    - NTP;
    - GeoLocation.
- Page tests
    The page tests are:
    - HTML standards adherence;
    - Accessibility standards adherence.
- Link tests
    There are two kinds of link tests. The simpler and faster one parses
    the link files generated by WIRE and groups the links by extension and
    destination domain. The slower one, called linkscompleto, additionally
    makes an HTTP HEAD request to each URL found, so it can gather more
    information about the object, such as its size or HTTP status.
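As a rough illustration of the two kinds of link tests, the following Java
sketch counts URLs by file extension and by destination host, and shows how a
HEAD request could report an object's status and size. The class and method
names (LinkGrouping, extensionOf, hostOf, countBy, head) are hypothetical and
are not taken from the ConNeCTOR source; the example.br URLs are placeholders.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch, not ConNeCTOR's actual implementation.
public class LinkGrouping {

    // Lower-cased extension of the last path segment, or "" when there is none.
    public static String extensionOf(String url) {
        String path = URI.create(url).getPath();
        if (path == null) {
            return "";
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase();
    }

    // Lower-cased destination host, or "" when the URL has none.
    public static String hostOf(String url) {
        String host = URI.create(url).getHost();
        return host == null ? "" : host.toLowerCase();
    }

    // Counts how many URLs share each key (extension or host).
    public static Map<String, Integer> countBy(List<String> urls,
                                               Function<String, String> key) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String url : urls) {
            counts.merge(key.apply(url), 1, Integer::sum);
        }
        return counts;
    }

    // "linkscompleto"-style check: HEAD request returning "<status> <size>".
    // The size is -1 when the server sends no Content-Length header.
    public static String head(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode() + " " + conn.getContentLengthLong();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> urls = Arrays.asList(
                "http://www.example.br/img/logo.PNG",
                "http://www.example.br/index.html",
                "http://cdn.example.br/img/banner.png");
        System.out.println(countBy(urls, LinkGrouping::extensionOf));
        System.out.println(countBy(urls, LinkGrouping::hostOf));
        // Network access happens only when URLs are passed on the command line.
        for (String url : args) {
            System.out.println(url + " -> " + head(url));
        }
    }
}
```

Run without arguments, the sketch only prints the two groupings; passing one
or more URLs on the command line also issues a HEAD request for each of them.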
<<<<<<<<<<<<<< Installation Guide >>>>>>>>>>>>>>>
This installation guide was written for Ubuntu 10.04 LTS, 32-bit.
Requirements:
Java:
open /etc/apt/sources.list
and uncomment the following lines:
# deb http://archive.canonical.com/ubuntu lucid partner
# deb-src http://archive.canonical.com/ubuntu lucid partner
so that they read:
deb http://archive.canonical.com/ubuntu lucid partner
deb-src http://archive.canonical.com/ubuntu lucid partner
$ apt-get update
$ apt-get install sun-java6-jre
During the installation you will be asked to accept the license
agreement.
W3C markup validator:
these steps follow the installation guide published at:
http://validator.w3.org/docs/install.html
If you are using 32-bit Linux, you must install ActivePerl 5.8:
(Note: the Perl that is already installed must not be removed)
Download the package for your OS version. It must be the
5.8 Debian package, which can be found at the end of the
page:
http://www.activestate.com/activeperl/downloads
$ dpkg -i <downloaded file>
install xmlto:
$ apt-get install xmlto
install OpenSP, an SGML (and XML) parser:
download the latest version from:
http://sourceforge.net/projects/openjade/
[extract the OpenSP archive and change into the generated folder]
$ ./configure
$ make
$ make install
install perl modules:
$ /opt/ActivePerl-5.8/bin/cpan
Note: on a 64-bit system, run the cpan command directly
> install YAML
> install Bundle::W3C::Validator
Note: accept all confirmation prompts
> quit
validator:
Download the validator and the DTD library from:
http://validator.w3.org/validator.tar.gz
http://validator.w3.org/sgml-lib.tar.gz
$ tar zxvf validator.tar.gz
$ tar zxvf sgml-lib.tar.gz
$ mkdir /usr/local/validator
$ cd validator-1.1
$ mv htdocs share httpd/cgi-bin /usr/local/validator
configuring the validator:
$ mkdir /etc/w3c
$ cp /usr/local/validator/htdocs/config/* /etc/w3c
$ nano /etc/w3c/validator.conf
set the option: Allow Private IPs = yes
and save the file
$ nano /usr/local/validator/cgi-bin/check
edit the first line of the file,
replacing: #!/usr/bin/perl -T
with: #!/opt/ActivePerl-5.8/bin/perl -T
save and close the file
to test, run the following command:
$ /usr/local/validator/cgi-bin/check uri=http://www.w3.org/
Configuring the Web server:
$ apt-get install apache2
from the extracted validator folder:
$ cp httpd/conf/httpd.conf /etc/apache2/sites-enabled/
other configuration options can be found in the official validator
installation guide
$ /etc/init.d/apache2 restart
test the validator by opening: http://localhost/w3c-validator/
ntpdate:
$ apt-get install ntpdate
$ apt-get install ntp
after installing ntp, create a file named "ntp.drift":
# touch /etc/ntp.drift
If your machine's clock is off by more than about 16 minutes, ntpd may not
work. In that case, set the clock manually before starting ntpd, or run the
following command:
# ntpd -q -g
Replace the contents of the NTP configuration file "/etc/ntp.conf" with:
# "memory" for the computer's clock frequency drift
# it may be necessary to create this file manually with
# the command touch ntp.drift
driftfile /etc/ntp.drift
# ntp statistics that make it possible to review the operating
# history and generate graphs
statsdir /var/log/ntpstats/
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
# public servers of the ntp.br project
server a.ntp.br iburst
server b.ntp.br iburst
server c.ntp.br iburst
# other servers
# server other-server.domain.br iburst
# access restriction settings
restrict default kod notrap nomodify nopeer
After that, restart the NTP daemon (on Ubuntu the init script is named ntp):
$ /etc/init.d/ntp restart
GeoIP:
$ mkdir -p /usr/local/share/GeoIP
$ nano /usr/local/share/GeoIP/atualizaGeoIP
insert the following code into the file:
#!/bin/bash
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games"
wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz
gunzip GeoIP.dat.gz
mv GeoIP.dat /usr/local/share/GeoIP/
save and close the file
$ chmod +x /usr/local/share/GeoIP/atualizaGeoIP
$ /usr/local/share/GeoIP/atualizaGeoIP
$ crontab -e
insert the following line, which runs the update at 00:00 on the 10th day of
every month:
0 0 10 * * sh /usr/local/share/GeoIP/atualizaGeoIP
save and close
MySQL:
$ apt-get install mysql-server mysql-client
For ConNeCTOR to work, you must create a database with the schema
provided by this project.
The command to load the schema is:
$ mysql -u user -p [database name] < DatabaseStructureV4.sql
This file can be downloaded from the docs folder on the project's
SourceForge site.
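Once the schema is loaded, a small JDBC check such as the sketch below can
confirm that the database is reachable. It assumes that the database value
passed to ConNeCTOR (e.g. //localhost:3306/teste) becomes a JDBC URL by
prefixing "jdbc:mysql:", which this README does not confirm, and that the
MySQL Connector/J driver is on the classpath. The class DbCheck and its
methods are hypothetical; Sitios is the table that holds the list of sites.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch; needs MySQL Connector/J on the classpath to connect.
public class DbCheck {

    // Assumption: the "database" value is a JDBC URL without its
    // "jdbc:mysql:" prefix.
    public static String jdbcUrl(String database) {
        return "jdbc:mysql:" + database;
    }

    // Returns the number of rows currently in the Sitios table.
    public static long countSites(String database, String user, String password)
            throws SQLException {
        try (Connection conn =
                     DriverManager.getConnection(jdbcUrl(database), user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM Sitios")) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 3) {
            System.out.println("usage: java DbCheck <database url> <user> <password>");
            return;
        }
        System.out.println("Sitios rows: " + countSites(args[0], args[1], args[2]));
    }
}
```

A count of 0 right after loading the schema is expected; it only shows that
the connection works and the table exists.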
IPv6:
$ ifconfig eth0 add <ipv6 address>
$ ip route add ::/0 via <gateway ipv6 address>
test the connection
and/or, to make the configuration persistent across reboots:
$ nano /etc/network/interfaces
add the IPv6 configuration:
iface eth0 inet6 static
pre-up modprobe ipv6
address <ipv6 address>
netmask 64
gateway <gateway ipv6 address>
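To test the IPv6 connection from Java, a sketch like the following may help.
It resolves a host name, keeps only its IPv6 addresses, and tries a TCP
connection to port 80 on each of them. The class Ipv6Check and its methods are
hypothetical and do not come from ConNeCTOR; the host to test is passed on the
command line.

```java
import java.io.IOException;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not ConNeCTOR's actual IPv6 test.
public class Ipv6Check {

    // Keeps only the IPv6 addresses from a resolved address list.
    public static List<InetAddress> ipv6Only(InetAddress[] addresses) {
        List<InetAddress> result = new ArrayList<>();
        for (InetAddress address : addresses) {
            if (address instanceof Inet6Address) {
                result.add(address);
            }
        }
        return result;
    }

    // True when a TCP connection to address:port succeeds within the timeout.
    public static boolean canConnect(InetAddress address, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(address, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java Ipv6Check <host name>");
            return;
        }
        List<InetAddress> v6 = ipv6Only(InetAddress.getAllByName(args[0]));
        if (v6.isEmpty()) {
            System.out.println("no IPv6 address found for " + args[0]);
        }
        for (InetAddress address : v6) {
            System.out.println(address.getHostAddress()
                    + " reachable on port 80: " + canConnect(address, 80, 5000));
        }
    }
}
```

If the host has IPv6 addresses but none is reachable, the address or default
route configured above is the first thing to check.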
Installation complete!
<<<<<<<<<<<<<< Execution >>>>>>>>>>>>>>>
The releases are .jar files that can be run with the "java -jar" command. It is
recommended to set both the initial and the maximum heap size to at least 1 GB
(e.g. -Xms1024M -Xmx1024M).
The parameter format is the following:
[retest] [maxThreads <number of threads to be used>]
[database <database url, e.g. //localhost:3306/teste>]
[user <database login>] [password <database password>]
[dontExcludeSites]
( consolidate <file name, with extension, to export to> |
consolidatebr <file name, with extension, to export to> | ipv6 |
resposta | geolocal | dominio | acessibilidade | html | csvpage | csvhost |
csvlang | links | xml | ntp | linkscompleto | downhome | htmloffline )
[file or folder path]
Below is a list of execution examples (the alias uses bash syntax):
alias connector='java -Xms1024M -Xmx1024M -jar connector-1.x.jar database //localhost:3306/teste password yourpw user you'
connector retest maxThreads 400 ipv6
connector retest maxThreads 1000 resposta
connector retest maxThreads 1000 geolocal
connector retest maxThreads 2000 dominio
connector retest maxThreads 400 ntp
connector retest maxThreads 60 html sites
Before running these tests, the database must be created and configured, and a
list of sites must be loaded into the Sitios table. You can load the list
directly in MySQL, or you can use the 'csvhost' option of connector to load
data from a CSV site list exported from WIRE-Nic. The command should look like
this:
connector retest maxThreads 400 csvhost sites.csv
A sample showing how this file must look is in the docs folder.
Also, the tests that validate pages, such as 'html' and 'acessibilidade', run
over the 'sites' folder generated by a WIRE-Nic execution, so the path to this
folder must be given as a parameter, as in the html example above.