Re: [Katta-developer] Newbie question
From: Johannes Z. <jzi...@go...> - 2011-09-26 08:28:01
Hey Giorgos,

I'm sorry for the late reply!

I'm not sure I fully understand your question. As you said, the number of hits corresponds to the number of documents in the index.

Creating a Katta index is no different from creating a plain Lucene index. It's the same thing. Only if you want to build the index in a distributed way would you use something like Hadoop, and the "SequenceFileCreation" step is just example code showing how to use Hadoop.

As a first step, I suggest creating Lucene indices the simple way (without Hadoop). Just follow any Lucene tutorial (e.g. http://www.lucenetutorial.com/lucene-in-5-minutes.html) and plug the resulting index into Katta!

HTH
Johannes

On Jun 30, 2011, at 4:48 PM, Giorgos Fragiadoulakis wrote:

> Hi,
>
> I am trying to learn how to create indexes using Katta.
> At the search step, I get incorrect number of hits for a specific word in a .txt file (e.g. alice.txt).
> As I understand, that depends on the number of records defined on the "Sequence File Creation" step, e.g. 1000.
> For 1000 records, I get 86 hits for the word "Alice". As I increase this number, e.g. 5000, the number of hits is increased.
> I examine the output indexes using Luke and as I can see, the hits number corresponds to the number of documents in which the "query" is included,
> although some documents are duplicated.
> I checked the "SequenceFileCreator.java" code, but I was not able to find a solution.
>
> Any response would be welcome.
>
> Greetings,
> Giorgos
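
To illustrate the point about hits and documents, here is a minimal sketch in plain Java. It does not use Lucene, Katta, or Hadoop classes. The names `HitCountSketch`, `makeDocuments`, and `countHits` are hypothetical, and the way `makeDocuments` cycles through the input lines is only an assumption about how a record-generation step might produce duplicated documents; it is not taken from SequenceFileCreator.java. It shows that a hit is counted once per matching document, so generating more records from the same text raises the hit count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustration only: plain Java, no Lucene or Katta classes involved.
class HitCountSketch {

    // Hypothetical stand-in for a record-generation step (an assumption, not
    // the real SequenceFileCreator logic): builds `records` documents by
    // cycling through the source lines, so lines repeat once `records`
    // exceeds the number of lines.
    static List<String> makeDocuments(List<String> lines, int records) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            docs.add(lines.get(i % lines.size()));
        }
        return docs;
    }

    // A "hit" is one document containing the term, however often it occurs
    // inside that document. Matching is case-insensitive on whole words.
    static int countHits(List<String> docs, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        int hits = 0;
        for (String doc : docs) {
            for (String token : doc.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.equals(needle)) {
                    hits++;
                    break; // count each document at most once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Alice was beginning to get very tired",
            "of sitting by her sister on the bank",
            "So Alice thought Alice should follow the rabbit");

        // Same source text, different number of generated records.
        System.out.println(countHits(makeDocuments(lines, 3), "Alice"));
        System.out.println(countHits(makeDocuments(lines, 6), "Alice"));
    }
}
```

Under these assumptions, doubling the number of records doubles the hits for "Alice" even though the source text is unchanged, which matches the behaviour described in the quoted message: the count reflects matching documents, duplicates included, not occurrences of the word in the original file.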