User Activity

  • Posted a comment on discussion Retrieval on The Lemur Project

    You need a phrase operator to get statistics for, and the text operators might then need to be extents, e.g.:

        Node n = new Node("od", 1);
        n.addChild(new Node("extents", "natural"));
        n.addChild(new Node("extents", "language"));

    In the StructuredQuery/Galago QL language, this would be:

        #od:1(#extents:natural() #extents:language())

    This is basically pseudocode, but the thing you want to count is an "ordered window" or phrase of the "positions" (called "extents") of your words. The generic "text" operator sometimes...

  • Posted a comment on discussion RankLib on The Lemur Project

    It's a redefinition of the metric under a single swap of ordering. This is used to accelerate some of the learning algorithms (to quickly compare the benefits of different rankings based on which documents they are able to swap). It is a difficult thing to implement and test but is critical to LambdaMART, I believe.
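To make the "single swap" idea concrete, here is a minimal sketch using DCG as the metric. It is not RankLib's code: the gain `2^rel - 1`, the discount `1 / log2(rank + 1)`, and the function names `dcg` and `swap_delta_dcg` are assumptions chosen for illustration. The point is that swapping two documents changes only two terms of the sum, so the difference can be computed without re-scoring the whole ranking.

```python
import math


def dcg(labels):
    """DCG with gain 2^rel - 1 and discount 1 / log2(rank + 1), ranks starting at 1."""
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(labels))


def swap_delta_dcg(labels, i, j):
    """Change in DCG if the documents at positions i and j trade places.

    Only the two swapped terms change, so no full re-evaluation is needed.
    """
    gain_i = 2 ** labels[i] - 1
    gain_j = 2 ** labels[j] - 1
    disc_i = 1 / math.log2(i + 2)
    disc_j = 1 / math.log2(j + 2)
    return (gain_j - gain_i) * (disc_i - disc_j)


# Illustrative relevance labels in ranked order (made up for this sketch).
labels = [0, 2, 1, 3]
swapped = list(labels)
swapped[0], swapped[3] = swapped[3], swapped[0]

# The shortcut agrees with recomputing the metric from scratch.
full_difference = dcg(swapped) - dcg(labels)
shortcut = swap_delta_dcg(labels, 0, 3)
```

Checking the shortcut against a full recomputation, as in the last lines, is also a reasonable way to test such an implementation.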

  • Posted a comment on discussion RankLib on The Lemur Project

    Yep, that's what I get for answering without trying.

  • Posted a comment on discussion RankLib on The Lemur Project

    Just by query type -- RankLib's loading is naive about that: it assumes adjacent lines with the same qid are the same query.
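The adjacency assumption can be illustrated with a short sketch. This is not RankLib's loader; `qid_of` and `group_adjacent` are hypothetical helpers, and the feature lines are made up. `itertools.groupby` groups only consecutive items with equal keys, so a qid that reappears later becomes a separate group.

```python
from itertools import groupby


def qid_of(line):
    """Return the value of the qid token in one feature line (hypothetical helper)."""
    for token in line.split("#", 1)[0].split():
        if token.startswith("qid:"):
            return token[len("qid:"):]
    raise ValueError("no qid in line: " + line)


def group_adjacent(lines):
    """Group consecutive lines sharing a qid, the way a naive loader would."""
    return [(qid, list(rows)) for qid, rows in groupby(lines, key=qid_of)]


lines = [
    "1 qid:1 1:0.5 # a",
    "0 qid:1 1:0.1 # b",
    "1 qid:2 1:0.9 # c",
    "0 qid:1 1:0.3 # d",  # same qid as the first two lines, but not adjacent
]

groups = group_adjacent(lines)                              # qid 1 is split in two
regrouped = group_adjacent(sorted(lines, key=qid_of))       # stable sort by qid first
```

Sorting the file by qid before loading, as in the last line, keeps each query's documents together.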

  • Posted a comment on discussion RankLib on The Lemur Project

    Doesn't matter, because RankLib ignores it. I usually put a zero, e.g.: 0 qid:001 1:0.5 2:0.7 # docid
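A minimal sketch of writing such a line for a document with no judgment, assuming the label-first layout used by LETOR-style feature files (label, then `qid:`, then 1-based `index:value` pairs, then a `#` comment). The helper name `format_line` and its parameters are hypothetical.

```python
def format_line(qid, features, docid, label=0):
    """Build one feature line; label defaults to the placeholder 0."""
    # Feature indices start at 1 in this layout.
    feats = " ".join(f"{index}:{value}" for index, value in enumerate(features, start=1))
    return f"{label} qid:{qid} {feats} # {docid}"


line = format_line("001", [0.5, 0.7], "docid")
```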

  • Posted a comment on discussion RankLib on The Lemur Project

    Threshold candidates are within a feature: it's how many times a feature is allowed to be split. -1 says that any difference in floating point values may be used. If you have fewer than 256 distinct values for a feature, -1 is equivalent.
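The following sketch shows one way the candidate split points for a single feature could be chosen. It is an illustration under stated assumptions, not RankLib's implementation: the function name `threshold_candidates`, the parameter `tc`, the limit of 256, and the evenly spaced fallback are all assumed here.

```python
def threshold_candidates(values, tc=256):
    """Return candidate split thresholds for one feature's observed values."""
    distinct = sorted(set(values))
    # With -1, or with fewer distinct values than the limit, every value is a candidate.
    if tc == -1 or len(distinct) < tc:
        return distinct
    # Otherwise fall back to tc evenly spaced points across the value range (assumed).
    lo, hi = distinct[0], distinct[-1]
    step = (hi - lo) / tc
    return [lo + k * step for k in range(tc)]


few = [0.1, 0.5, 0.5, 0.9]
same_either_way = threshold_candidates(few, tc=-1) == threshold_candidates(few, tc=256)
many = threshold_candidates(range(1000), tc=256)
```

With only three distinct values, both settings yield the same candidates, which is the equivalence the comment describes.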

  • Posted a comment on discussion RankLib on The Lemur Project

    Correct. The key (e.g., #1A) is how RankLib stores the document names.

  • Posted a comment on discussion RankLib on The Lemur Project

    qrel is in trec_eval format, e.g., from the answer below: https://stackoverflow.com/questions/4275825/how-to-evaluate-a-search-retrieval-engine-using-trec-eval
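For reference, a trec_eval qrel line has four whitespace-separated columns: query id, an iteration field that is conventionally 0, document id, and relevance. A small sketch of reading that layout follows; the function name `parse_qrels` and the sample document ids are hypothetical.

```python
def parse_qrels(lines):
    """Map query id -> {document id: relevance} from trec_eval-style qrel lines."""
    judgments = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        qid, _iteration, docid, relevance = parts
        judgments.setdefault(qid, {})[docid] = int(relevance)
    return judgments


sample = [
    "1 0 doc-a 1",
    "1 0 doc-b 0",
    "",
    "2 0 doc-c 2",
]
qrels = parse_qrels(sample)
```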


Personal Data

Username:
jjfoley
Joined:
2013-06-14 17:41:26

Projects

This is a list of open source software projects that John Foley is associated with:

  • The Lemur Project: Search engine and data mining applications and ClueWeb datasets.
