-
Hi, is there any way in the Lemur API, or does anyone have ideas, to eliminate JavaScript code from the text snippets returned in retrieval?
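One idea, as a rough sketch and not anything from the Lemur API itself: strip the script elements from the document text before building the snippet. The example below uses only Python's standard html.parser; the names ScriptStripper and strip_scripts are made up for illustration, and it assumes the snippet source is available as an HTML string.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Collects text content while skipping everything inside <script>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while we are inside a <script> element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "script" and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside any <script> element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the kept chunks and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_scripts(html):
    """Return the visible text of html with JavaScript blocks removed."""
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = '<p>Hello</p><script>var x = "<b>";</script><p>world</p>'
    print(strip_scripts(sample))
```

This only handles script elements; inline event handlers in attributes are dropped anyway because attributes are never collected.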
Thanks in advance.
2009-11-24 19:52:30 UTC in The Lemur Toolkit
-
Hi David, about your comment here that using the trecweb format is not recommended when building an Indri index:
I have all my input documents in trecweb, in separate files, and all in UTF-8. Does this eliminate the possible encoding mismatch you mention here, or should I do something else to avoid that mismatch?
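For reference, this is roughly how the UTF-8 claim could be double-checked before indexing. It is a minimal sketch in plain Python, independent of Indri; the function name find_non_utf8_files and the directory name in the usage line are hypothetical.

```python
from pathlib import Path


def find_non_utf8_files(corpus_dir):
    """Return the paths under corpus_dir whose bytes are not valid UTF-8."""
    bad = []
    for path in sorted(Path(corpus_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            # Strict decoding raises on any byte sequence that is not UTF-8.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad


if __name__ == "__main__":
    # "./docs" is a placeholder for the directory holding the trecweb files.
    for path in find_non_utf8_files("./docs"):
        print("not valid UTF-8:", path)
```

An empty result only shows that the bytes decode as UTF-8; it does not check any charset declared inside the HTML itself.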
Thanks.
2009-11-04 02:38:35 UTC in The Lemur Toolkit
-
Hi, here is a link with information on how to specify retrieval parameters and how to format your queries:
[IndriRunQuery](http://sourceforge.net/apps/trac/lemur/wiki/RetEval%20and%20IndriRunQuery)
regards.
raul.
2009-09-24 17:31:14 UTC in The Lemur Toolkit
-
I forgot that I posted this before.
Does anybody know how to do this?
2009-09-05 19:45:23 UTC in The Lemur Toolkit
-
Hi,
I'm testing retrieval on an Indri index of documents ranked with the pagerank tool, and I'm comparing the results with the same index without the priors added to it.
The first difference I notice is that in the search results of the ranked index there are a lot of documents from the same domain grouped together, which doesn't happen with the unranked index.
Does anybody know how to...
2009-09-05 03:16:53 UTC in The Lemur Toolkit
-
Hi,
Is it possible to use the harvestlinks tool with an Indri index as the source, instead of the corpus path with the documents to index?
I'm using pagerank with an Indri index as the source and I want to know if I can do the same with the harvestlinks app.
Thanks
-raul-.
2009-09-04 03:35:50 UTC in The Lemur Toolkit
-
You're completely right! There was a problem with the trecweb format: it seems it always needs a line break after each tag (<DOC>, <DOCNO>, etc.).
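In case it helps someone else, the fix can be sketched like this. It is only my reading of the problem, not documented Indri behavior: the list of tags that must end a line, keeping the DOCNO element on a single line, and the name add_line_breaks are all assumptions.

```python
import re

# Assumption: these are the trecweb tags that should be followed by a line
# break. The negative lookahead skips tags that already end their line.
_LINE_ENDING_TAGS = re.compile(
    r"(<DOC>|</DOCNO>|<DOCHDR>|</DOCHDR>|</DOC>)(?!\r?\n)"
)


def add_line_breaks(trecweb_text):
    """Insert a newline after each listed tag that is not already followed by one."""
    return _LINE_ENDING_TAGS.sub(r"\1\n", trecweb_text)


if __name__ == "__main__":
    flat = "<DOC><DOCNO>d1</DOCNO><DOCHDR>http://x</DOCHDR><html></html></DOC>"
    print(add_line_breaks(flat))
```

Running it twice gives the same result as running it once, so it should be safe to apply to files that are already partly well formed.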
Thanks.
2009-08-14 00:46:48 UTC in The Lemur Toolkit
-
Hello, any hints?
2009-08-12 17:00:00 UTC in The Lemur Toolkit
-
Hi,
I'm trying to use the harvestlinks app with a trecweb collection of 1580 separate files. The command line I'm using is:
harvestlinks -corpus=./docs_test -output=./hrv_test
but I get this error:
0:06 Phase 4: Combining harvested links to final output...
Error opening sorted link file './hrv_test/harvest/linkFile.sorted'...
2009-08-07 18:51:00 UTC in The Lemur Toolkit
-
Hi,
Does anybody know if it's possible to get the HTML
metadata (e.g. <meta var="content">) from the API?
I know you can index metadata embedded in the TREC format by specifying it in
the parameters XML file like this:
<metadata><field>fieldname</field></metadata>
and you can get it back using the API, but this
only works when the...
2009-07-14 18:50:19 UTC in The Lemur Toolkit