Leopdo (beta) Search Engine (2011)

A web search engine and crawler written in Java, including full-text search, vertical search, and a word segmentation system.

/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
1. Install (requires JDK 6, Tomcat 6.0, and MySQL 5.5 or above)

1) Install MySQL (port: 3306, user/password: root/123456)
   Install a MySQL GUI administration tool

2) Import the database: leopdo.sql (distributed as leopdo.sql.rar)

3) Install Tomcat (port: 80)

4) Copy leopdo.war to Tomcat's webapps\ directory

5) Start Tomcat
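
Before moving on, it can help to confirm that both services are listening. The following is a minimal sketch, not part of Leopdo: it assumes MySQL and Tomcat run on localhost with the ports given above, and the helper name port_open is made up for illustration.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the install steps: MySQL on 3306, Tomcat on 80.
    for name, port in (("MySQL", 3306), ("Tomcat", 80)):
        state = "listening" if port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

An open port only shows that something accepts connections there. It does not prove that leopdo.war was deployed or that the database was imported.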


////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
2. Start the search engine: open a browser (IE or Firefox) and request the URLs below, in order, to run the tasks

1)
http://localhost/leopdo/bot/task/Com_websync.do?task=domain&url=http://www.hao123.com&batch=2&batchHandle=1
Retrieves two levels of a website's pages (batch=2), starting from the home page: the first-level pages linked from the home page, and the second-level pages. The pages are saved in the database.
The website should be a navigation (directory) website such as http://dir.yahoo.com or http://www.hao123.com


2)
http://localhost/leopdo/bot/task/Com_websync.do?task=digdomain&url=http://www.hao123.com&batch=2&update=-1&batch2=0&dimstart=0
Retrieves the home page of each website collected from the navigation website (http://www.hao123.com)


3)
http://localhost/leopdo/bot/task/Com_alldomaintask.do?tasktype=html&batch=1&update=-1&dimstart=0&sectionId=1531
Reads the home pages from the database and retrieves two levels of pages for each of these websites. If sectionId=1531, the home pages with sectionId=1531 are read


4)
http://localhost/leopdo/bot/task/Com_alldomaintask.do?tasktype=key&batch=1&titleOnly=2&kupdate=-1&dimstart=0&sectionId=null
Reads the pages of the websites from the database and generates the keywords


5)
Full-text search test:
open http://localhost/leopdo/search.html and enter a keyword
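
Tasks 1) to 4) above can also be requested from a script instead of a browser. The sketch below is only an illustration under stated assumptions: that each request returns when its task has finished, and that HTTP 200 means success. The README confirms neither. The names http_get and run_tasks and the one-hour timeout are hypothetical.

```python
import urllib.request
from typing import Callable, List, Tuple

BASE = "http://localhost/leopdo"

# The four crawl and indexing tasks, copied from steps 1) to 4), in order.
TASK_URLS: List[str] = [
    BASE + "/bot/task/Com_websync.do?task=domain&url=http://www.hao123.com&batch=2&batchHandle=1",
    BASE + "/bot/task/Com_websync.do?task=digdomain&url=http://www.hao123.com&batch=2&update=-1&batch2=0&dimstart=0",
    BASE + "/bot/task/Com_alldomaintask.do?tasktype=html&batch=1&update=-1&dimstart=0&sectionId=1531",
    BASE + "/bot/task/Com_alldomaintask.do?tasktype=key&batch=1&titleOnly=2&kupdate=-1&dimstart=0&sectionId=null",
]


def http_get(url: str, timeout: float = 3600.0) -> int:
    """Request url and return the HTTP status code (timeout value is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def run_tasks(urls: List[str], fetch: Callable[[str], int] = http_get) -> List[Tuple[str, int]]:
    """Request the URLs one after another; stop after the first non-200 status."""
    results = []
    for url in urls:
        status = fetch(url)
        results.append((url, status))
        if status != 200:
            break
    return results


if __name__ == "__main__":
    for url, status in run_tasks(TASK_URLS):
        print(status, url)
```

With the default http_get, an HTTP error or a timeout raises an exception, which also stops the run before the later tasks are requested.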


//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
3. Other applications based on the Leopdo search engine:

Build a vertical search engine (news, music, books, shopping, etc.)

1)
select * from leopdo.thing where source1 = -1 and rec_create_location = 'hao123.com'
In the result, find the record whose description = 'news'. The id of this record (for example 1531) is the sectionId

2)
http://localhost/leopdo/bot/task/Com_alldomaintask.do?tasktype=html&batch=1&titleOnly=2&kupdate=-1&update=-1&dimstart=0&sectionId=1531
Reads the pages of the websites with sectionId=1531 from the database. If update=1, the old pages are updated

3)
http://localhost/leopdo/bot/task/Com_alldomaintask.do?tasktype=key&batch=1&titleOnly=2&kupdate=-1&update=-1&dimstart=0&sectionId=1531
Reads the pages of the websites with sectionId=1531 from the database and generates the keywords

4)
delete from leopdo.nthing
Removes all records from the news table

5)
http://localhost/leopdo/searcher/search.do?type=updatenews&date1=2011-06-01&date2=2011-06-02
Reads the news data from 2011-06-01 to 2011-06-02, sorts the records, and saves them in the news table

6)
Browse the latest news:
http://localhost/leopdo/searcher/search.do?type=news
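
The URLs in steps 2), 3) and 5) differ only in tasktype, sectionId and the two dates, so they can be generated for another section or date range. This is a minimal sketch, assuming the parameter values shown above carry over unchanged. The function names are hypothetical and are not part of Leopdo.

```python
from datetime import date
from urllib.parse import urlencode

BASE = "http://localhost/leopdo"


def section_task_url(tasktype: str, section_id: int) -> str:
    """Build the Com_alldomaintask.do URL of steps 2) and 3) for one section."""
    params = [
        ("tasktype", tasktype),  # "html" for step 2), "key" for step 3)
        ("batch", 1),
        ("titleOnly", 2),
        ("kupdate", -1),
        ("update", -1),
        ("dimstart", 0),
        ("sectionId", section_id),
    ]
    return f"{BASE}/bot/task/Com_alldomaintask.do?{urlencode(params)}"


def news_update_url(start: date, end: date) -> str:
    """Build the search.do URL of step 5) for the given date range."""
    params = [
        ("type", "updatenews"),
        ("date1", start.isoformat()),  # formatted as YYYY-MM-DD
        ("date2", end.isoformat()),
    ]
    return f"{BASE}/searcher/search.do?{urlencode(params)}"


if __name__ == "__main__":
    print(section_task_url("html", 1531))
    print(section_task_url("key", 1531))
    print(news_update_url(date(2011, 6, 1), date(2011, 6, 2)))
```

For sectionId 1531 and the dates 2011-06-01 to 2011-06-02, the three printed URLs are the same as those listed in steps 2), 3) and 5).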


////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Known issue: Java HTTP timeouts and HTTP connection timeouts

Request the URLs below to continue the task:

http://localhost/leopdo/bot/task/Com_clearpool.do?flag=1

http://localhost/leopdo/bot/task/Com_checkq.do
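
When the tasks are driven from a script, the two URLs above could be requested automatically after a timeout. The sketch below only illustrates that idea. The README does not describe what Com_clearpool.do and Com_checkq.do do internally, so the code just requests them in the listed order. The function names and the 600-second timeout are assumptions.

```python
import socket
import urllib.error
import urllib.request

# The two URLs listed above, in the order given.
RECOVERY_URLS = [
    "http://localhost/leopdo/bot/task/Com_clearpool.do?flag=1",
    "http://localhost/leopdo/bot/task/Com_checkq.do",
]


def default_fetch(url, timeout=600.0):
    """Request url and return the HTTP status code (timeout value is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def is_timeout(exc):
    """True for a socket timeout, raised directly or wrapped in a URLError."""
    if isinstance(exc, socket.timeout):
        return True
    return isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, socket.timeout)


def run_with_recovery(task_url, fetch=default_fetch, recovery_urls=RECOVERY_URLS):
    """Request task_url; if it times out, request the recovery URLs in order.

    Returns True if the task request finished without a timeout,
    False if the recovery URLs had to be requested.
    """
    try:
        fetch(task_url)
        return True
    except OSError as exc:
        # Errors other than timeouts (for example, connection refused) are re-raised.
        if not is_timeout(exc):
            raise
    for url in recovery_urls:
        fetch(url)
    return False
```

A call such as run_with_recovery("http://localhost/leopdo/bot/task/Com_checkq.do") returns False only when the first request timed out and both recovery URLs were then requested.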

Source: readme_en.txt, updated 2011-07-04