HITSearchEngine is a campus search engine for HIT that clusters search results into topics, so users can quickly find what they are looking for. The system has four parts: 1. Web crawling 2. HTML parsing 3. Indexing 4. Searching. We use the open source crawler Heritrix to fetch pages and Lucene to build the index. To help users narrow down results, we use Carrot2 to cluster the search results into topics. We also wrote a script that fetches the campus websites every day and updates the index automatically.
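
The parsing, indexing, and searching stages can be illustrated with a small sketch. This is not the project's code and does not use Lucene, Heritrix, or Carrot2. It is a toy in-memory inverted index written with the Java standard library only, and the class name `TinyIndex`, its methods, and the example URLs are all hypothetical. The tag stripping is a crude regular expression standing in for a real HTML parser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: a toy inverted index, not the project's Lucene setup.
class TinyIndex {
    // term -> IDs of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document ID -> URL, in the order pages were added
    private final List<String> urls = new ArrayList<>();

    // "HTML parsing": crudely strip tags, lower-case, split on non-alphanumerics.
    static List<String> tokenize(String html) {
        String text = html.replaceAll("<[^>]*>", " ").toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "Indexing": record which document each term occurs in.
    void add(String url, String html) {
        int id = urls.size();
        urls.add(url);
        for (String term : tokenize(html)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
    }

    // "Searching": AND query, returning URLs of pages that contain every query term.
    List<String> search(String query) {
        Set<Integer> hits = null;
        for (String term : tokenize(query)) {
            Set<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (hits == null) {
                hits = new TreeSet<>(docs);
            } else {
                hits.retainAll(docs);
            }
        }
        List<String> result = new ArrayList<>();
        if (hits != null) {
            for (int id : hits) {
                result.add(urls.get(id));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("http://example.edu/library", "<h1>Library</h1><p>Opening hours</p>");
        index.add("http://example.edu/news", "<p>Library news and campus events</p>");
        System.out.println(index.search("library hours")); // prints [http://example.edu/library]
    }
}
```

A real index such as Lucene adds ranking, stemming, and on-disk storage on top of this basic term-to-document mapping. Clustering the hits into topics would be a separate step applied to the returned result list.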

License

W3C License

Additional Project Details

Operating Systems: Linux

Languages: English

Programming Language: Java

Database Environment: MySQL

Registered: 2012-08-07