
Concurrent retrieval

Galago
2018-03-08
2018-03-14
  • Anton van der Vegt

    I'm using the Galago Java libraries to perform retrieval on a single index. Does this support concurrent retrieval? If not, do you have any methods or examples that support concurrent retrieval?
    Thank you.

     
  • Lemur Project

    Lemur Project - 2018-03-09

    I don't believe Galago supports concurrent query-based retrieval.

    Galago does support distributed retrieval, where indexes on different hosts are each served by a Galago search server on that host. One then runs batch-search against those indexes, referencing them by URL.

    However, it is the per-index Retrieval objects that run in their own threads (concurrent searches by index), with results merged when all threads complete. The queries themselves are not distributed across threads.

    You could look at the MultiRetrieval.java code for details of this.

    It's possible there is some concurrent, threaded query processing, but I'd have to look more closely at the code to be more definitive.
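
The per-index threading described above (one search per index, results merged once every thread has finished) can be sketched roughly as follows. This is not the MultiRetrieval.java code. The class `MultiIndexSearchSketch`, the `Hit` type, and the per-index `Function<String, List<Hit>>` searchers are hypothetical stand-ins, and the sketch assumes scores from different indexes are directly comparable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class MultiIndexSearchSketch {

    /** Hypothetical scored result; not a Galago class. */
    public static final class Hit {
        public final String document;
        public final double score;

        public Hit(String document, double score) {
            this.document = document;
            this.score = score;
        }
    }

    /**
     * Sends one query to every index searcher in parallel, waits for all of
     * them, then merges the hits by descending score and keeps the top K.
     * Each searcher is a hypothetical stand-in for one index's retrieval.
     */
    public static List<Hit> search(
            List<Function<String, List<Hit>>> indexSearchers,
            String query,
            int topK) {
        // One worker thread per index (at least one, so an empty list is harmless).
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, indexSearchers.size()));
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (Function<String, List<Hit>> searcher : indexSearchers) {
                tasks.add(() -> searcher.apply(query));
            }
            // invokeAll blocks until every per-index search has completed.
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> future : pool.invokeAll(tasks)) {
                merged.addAll(future.get());
            }
            // Assumption: scores from different indexes are comparable.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topK, merged.size())));
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("index search failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Here the query is the same for every thread and only the index differs, which is the distinction the post draws between distributing Retrieval objects and distributing queries.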

     
  • John Foley

    John Foley - 2018-03-09

    The Retrieval interface uses a single thread to score a single query, and the underlying disk indexes are read-only. Therefore, you should be able to run as many queries as you have cores at the same time and have no problem.

    There is also a CLI for this that works like regular batch-search:
    galago threaded-batch-search
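
A minimal sketch of the pattern described above: one shared, read-only searcher and a fixed pool of worker threads, with one query per task. It assumes the searcher is safe to call from several threads at once, as the post says of Retrieval over read-only disk indexes. The class `ConcurrentSearchSketch`, the method `searchAll`, and the `Function<String, List<String>>` searcher are hypothetical stand-ins rather than Galago API; in a real service the function would wrap a call into a Galago Retrieval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ConcurrentSearchSketch {

    /**
     * Runs each query as its own task against one shared searcher.
     * `searcher` is a hypothetical stand-in for a call into a Galago
     * Retrieval and must be safe to call from several threads at once.
     * Results are returned in the same order as `queries`.
     */
    public static List<List<String>> searchAll(
            Function<String, List<String>> searcher,
            List<String> queries,
            int threadCount) {
        // A sensible upper bound for threadCount is the number of cores.
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (String query : queries) {
                tasks.add(() -> searcher.apply(query));
            }
            // invokeAll blocks until all queries finish and preserves task order.
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while searching", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("query failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A long-running service would normally keep a single pool and a single searcher alive for its whole lifetime instead of creating a pool per batch; the pool is created inside the method here only to keep the sketch self-contained.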

     
  • Lemur Project

    Lemur Project - 2018-03-09

    It looks like threadCount is the parameter that controls how many query threads threaded-batch-search uses.

    Do you know off-hand if threaded-batch-search can handle multiple indexes? Just wondering how it might handle distribution of Retrieval objects in their own threads and the distribution of queries also in their own threads.

     
  • John Foley

    John Foley - 2018-03-09

    Assuming all the logic is done correctly in MultiRetrieval, then it should all just work with multiple indexes. There's no state to get confused across queries, so everything should be ok. I forgot that MultiRetrieval allowed threads.

     
  • Anton van der Vegt

    Thank you for your responses.

    To be clear, I am building a Dropwizard service that uses Galago as the search engine. Each new request will run on a new thread; however, there is just one Galago index file. I was wondering whether I will get file locks or other kinds of deadlocks when another thread is already performing a search.

    I could of course copy the index file, but then I would have to marshal the requests to the files not in use... I'm not sure how I would do that, and I'm hoping I don't have to!

     
