indexing

soumaya
2012-05-19
2012-08-13
  • soumaya

    soumaya - 2012-05-19

    Why are the index files not in a plain-text format such as .txt? Why are they in formats like .directed.bf, .inverted.bf, etc.?

    With IRToolkit, can't we view the index file in a text format such as: docId freqDoc freqCorpus weight token?

    Best Regards

     
  • Duy Dinh

    Duy Dinh - 2012-05-19

    Hi,
    In general, the index structures are stored in binary format, not in text format.
    This has many advantages for storing and accessing the index.
    As a result, the index contents cannot be viewed with a text editor.

    Best,
    Duy
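
To illustrate the point about binary versus text storage, here is a minimal sketch in Python. The record layout (two unsigned 32-bit integers per posting) and the function names are assumptions made up for illustration; this is not the actual layout of the .directed.bf or .inverted.bf files, which is not described in this thread. It only shows why such a file looks like noise in a text editor and how a small program can turn it back into readable lines.

```python
import struct

# Hypothetical record layout, NOT the real IRToolkit/.inverted.bf format:
# each posting is two unsigned 32-bit big-endian integers (doc_id, term_freq).
POSTING = struct.Struct(">II")


def encode_postings(postings):
    """Pack (doc_id, term_freq) pairs into a compact byte string."""
    return b"".join(POSTING.pack(doc_id, tf) for doc_id, tf in postings)


def decode_postings(data):
    """Unpack the byte string back into a list of (doc_id, term_freq) pairs."""
    return [
        POSTING.unpack_from(data, offset)
        for offset in range(0, len(data), POSTING.size)
    ]


def dump_as_text(term, data):
    """Render binary postings as readable lines: term doc_id term_freq."""
    return "\n".join(
        f"{term} {doc_id} {tf}" for doc_id, tf in decode_postings(data)
    )


postings = [(1, 3), (7, 1), (42, 5)]
blob = encode_postings(postings)
print(len(blob))  # fixed-size records: 3 postings x 8 bytes each
print(dump_as_text("index", blob))
```

The fixed-size records are what make random access cheap: the n-th posting starts at byte offset n * 8, with no parsing required. A text file would need to be scanned line by line to find the same entry.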

     
    • didou

      didou - 2012-08-12

      Hello,
      How can I use an index built by IRToolkit in another program?
      I want to browse the index to extract term frequencies and use them as input to my program. What file format should I use to read an index built by Lemur, for example?

       
  • Duy Dinh

    Duy Dinh - 2012-08-13

    Hi,
    If you chose Lemur as the indexer, you need to read the Lemur documentation to understand its data structures, and then use the appropriate API to access the index.

    IRToolkit uses Lemur version 3.1:
    http://www.lemurproject.org/download-archive.php#3.1

    See also the LemurRetriever class for an example of how to work with Lemur's index structures.

    Best regards,
    Duy

     

    Last edit: Duy Dinh 2012-08-13
