Read Me
You are strongly advised to familiarise yourself with
the content of sspider-setup and sspider-search before
attempting to run the programs. Expect to need several
searches to return the data you want, and note that even
a simple query can take more than an hour to run,
depending on the extent of your search. Bigger queries
will take longer.
Simply running sspider-setup should set up the needed tables
for your first search without any attachments (I hope you
like chocolate). Before you do, you should have provided a
suitable user name and password for your database, and
edited these into sspider-setup and sspider-search.
(The default database is querynet0.)
You should also have created two tables, hubs and authorities,
in the database. Example code for the hubs table is shown below:
CREATE TABLE hubs (
qword varchar(250),
url varchar(250),
relevancy float4,
plref int4,
logit timestamp default now(),
uidx serial PRIMARY KEY
);
Edit a copy of this statement and execute it to create the
authorities table.
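As a sketch of that edit, and assuming the authorities table
uses exactly the same columns as hubs (this README does not
say otherwise, but check sspider-setup and sspider-search to
be sure), only the table name needs to change:

```sql
-- Assumption: authorities mirrors the hubs column layout.
CREATE TABLE authorities (
    qword     varchar(250),
    url       varchar(250),
    relevancy float4,
    plref     int4,
    logit     timestamp default now(),
    uidx      serial PRIMARY KEY
);
```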
Then create the queries table:
CREATE TABLE queries (
query character varying(250),
qword character varying(250) PRIMARY KEY,
url_limit integer,
url_count integer,
status character varying(10),
nword character varying(250),
mword character varying(250)
);
Run the setup by executing ./sspider-setup
and confirm that the tables queries, hubs and authorities
are present. Now would be a good time to inspect the
content of these tables. Absent any obvious errors, run
the search by executing ./sspider-search.
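One possible way to do the confirmation and inspection step
from psql is sketched below. It assumes the tables were created
in the default "public" schema of your database; adjust the
schema name if yours differs.

```sql
-- List which of the three expected tables exist
-- (assumes the "public" schema).
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_name IN ('queries', 'hubs', 'authorities')
ORDER BY table_name;

-- Inspect what the setup has placed in the queries table.
SELECT qword, query, url_limit, url_count, status
FROM queries;

-- Count the rows currently held in each result table.
SELECT count(*) AS hub_rows FROM hubs;
SELECT count(*) AS authority_rows FROM authorities;
```

If the first query returns fewer than three rows, one of the
CREATE TABLE statements above has not been executed.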
There will likely be complaints from Perl as the code runs,
because HTML pages containing UTF-8 content are not fully
parsed by this version of the programs.
Eventually, the program should print out twenty HTML
links to hubs and twenty HTML links to authorities.
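If you want to look at the stored results afterwards, a query
along these lines may help. It is only a sketch: it assumes
that a higher relevancy value means a better match, and that
the qword of your first search is 'chocolate'. Check the
queries table for the actual qword before running it.

```sql
-- Assumptions: higher relevancy is better; 'chocolate' is a
-- placeholder qword, replace it with one from the queries table.
SELECT url, relevancy
FROM hubs
WHERE qword = 'chocolate'
ORDER BY relevancy DESC
LIMIT 20;
```

The same query against the authorities table should list the
stored authority links.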