infoRetSearchEngine Code
Status: Pre-Alpha
Brought to you by:
cnishanth
ZODB must be installed for Python 2.4 for this program to run. ZODB is available at www.zope.org/Wikis/ZODB

To index the newsgroup items in a folder, say 20newsGroup, run 'python Indexer.py True 20newsGroup'
To search for a query on the index, run 'python Search.py "<query>"'
To compute the top <n> words, run 'python computeStatistics.py <n>'

Optional global parameters in Globals.py:
1. doStopListCheck - True/False
   Tells the indexer whether the stopWordsList is to be used.
2. doStemming - True/False
   Tells the indexer whether stemming should be performed.
3. generateDocumentVectors - True/False
   Whether vector-based ranking is required.
4. doClustering - True/False
   Whether clustering should be performed at the end.
5. noRankedDocs - integer
   Number of documents to be returned after vector retrieval. This is also the number of documents that will be clustered.
6. clusterCount - integer
   Number of clusters to start with in the k-means algorithm.
7. doBlindRelevanceFeedback - True/False
   Whether blind relevance feedback should be used and the query regenerated.
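
As an illustration of the optional parameters, a Globals.py could look roughly like the sketch below. The parameter names are the ones listed in this README; every value is an assumption chosen for the example, not the project's actual default, and the real Globals.py may contain other settings as well.

```python
# Hypothetical Globals.py sketch.
# Names come from the README; all values are illustrative assumptions.

doStopListCheck = True            # use the stopWordsList while indexing
doStemming = True                 # perform stemming while indexing
generateDocumentVectors = True    # vector-based ranking is required
doClustering = False              # do not cluster at the end
noRankedDocs = 20                 # documents returned after vector retrieval
                                  # (also the number of documents clustered)
clusterCount = 5                  # clusters to start with in k-means
doBlindRelevanceFeedback = False  # do not regenerate the query from feedback
```

Since noRankedDocs is also the number of documents that get clustered, it seems sensible to keep clusterCount no larger than noRankedDocs, although the README does not say whether the program enforces this.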