
#78 Index version mismatch warning

Next_Release
closed
indri (6)
1
2013-12-12
2013-11-20
No

I ran into a strange problem with Indri when using indexes built with
an older version. I'm not sure exactly when the change occurred, but
if I run a default BOW run of queries 701 -- 850 on the GOV2
collection, there is a big difference in the scores depending on
whether the queries are run with v5.0 or v5.5.

Indri 5.5 - default config
map all 0.2326
P_10 all 0.4886
ndcg all 0.4836
ndcg_cut_10 all 0.3995

Indri 5.0 - default config
map all 0.2805
P_10 all 0.5678
ndcg all 0.5578
ndcg_cut_10 all 0.4646

Notes: I did not use the HTML parser in Indri. Rather, I used
Boilerpipe to generate plaintext and then wrapped the documents in the
old trectext style. I then checked the manifest file and saw that my
index had been built with Indri 5.0 a while ago.

Next, I reindexed the text collection using the new v5.5 indexer. Now
when I run IndriRunQuery v5.5, I get the following scores:
map all 0.2816
P_10 all 0.5577
ndcg all 0.5570
ndcg_cut_10 all 0.4579

I suppose it is also a little surprising that the scores change
slightly between 5.0 and 5.5, but maybe that is known. I expected them
to be rank equivalent.
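
One way to check rank equivalence directly, rather than inferring it from the evaluation scores, is to compare the two run files query by query. The following is only a minimal sketch: it assumes both runs are in the standard six-column TREC run format (query id, Q0, docno, rank, score, run tag) and that the rank column reflects the intended order. The function names `read_rankings` and `differing_queries` are made up for illustration and are not part of Indri.

```python
from collections import defaultdict


def read_rankings(lines):
    """Map each query id to its list of docnos, ordered by the rank column."""
    per_query = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        qid, _q0, docno, rank, _score, _tag = fields
        per_query[qid].append((int(rank), docno))
    return {qid: [doc for _, doc in sorted(pairs)]
            for qid, pairs in per_query.items()}


def differing_queries(run_a, run_b):
    """Return the sorted query ids whose document orderings differ."""
    a = read_rankings(run_a)
    b = read_rankings(run_b)
    return sorted(q for q in a.keys() | b.keys() if a.get(q) != b.get(q))
```

Passing the lines of two run files (for example, `open(path)` for each) returns the queries whose orderings differ; an empty list means the runs are rank equivalent even if the scores are not identical. Note that trec_eval orders documents by score rather than by the rank column, so ties may be treated differently there.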

I think, but have not fully tested, that if you build an index using
v5.2 of Indri, then using IndriRunQuery v5.5 returns the same results.

Just thought I should let you know. Maybe version 5.5 or 5.6 should
print a warning to stderr if the index version is too old? I have to
assume there was some sort of index format change that happened
post 5.0.

Discussion

  • David Fisher

    David Fisher - 2013-11-20

    To ship in the 12/2013 release.

    When the major version number is different, or when the major version number is 5 and the minor is less than 3 (special case), or when there is a difference of more than 4 in minor versions (2-1/2 years), throw an Exception.

    harvey:~/Development/test$ ../indri/runquery/IndriRunQuery -index=stem_test/ -query=eat
    UtilityThread exiting from exception IndriRunQuery.cpp(611): QueryThread::_initialize
        ../src/Repository.cpp(435): Couldn't open a repository in read-only mode at 'stem_test/' because:
        ../src/Repository.cpp(244): _openIndexes: Couldn't open DiskIndexes because:
        ../src/DiskIndex.cpp(52): _readManifest: Cannot open index created with version 5.2 using my version 5.5. You need to reindex your data with this version.
    query: 0 QueryThread::_initialize exception
    
    harvey:~/Development/test$ ../indri/dumpindex/dumpindex stem_test/ s
    ../src/Repository.cpp(435): Couldn't open a repository in read-only mode at 'stem_test/' because:
        ../src/Repository.cpp(244): _openIndexes: Couldn't open DiskIndexes because:
        ../src/DiskIndex.cpp(52): _readManifest: Cannot open index created with version 5.2 using my version 5.5. You need to reindex your data with this version.
    
     
  • David Pane

    David Pane - 2013-12-12
    • status: open --> closed
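
The compatibility rule described in David Fisher's comment above can be sketched as follows. This is an illustration only, not the actual C++ check in Indri's `DiskIndex.cpp`. It assumes that the "minor is less than 3" special case refers to the version that built the index, and that the "more than 4" rule is an absolute difference between minor versions. The names `parse_version` and `can_open_index` are hypothetical.

```python
def parse_version(version):
    """Split a 'major.minor' string such as '5.5' into two integers."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def can_open_index(index_version, runtime_version):
    """Return True if the rule from the ticket would allow opening the index."""
    index_major, index_minor = parse_version(index_version)
    runtime_major, runtime_minor = parse_version(runtime_version)

    # Rule 1: a different major version is never compatible.
    if index_major != runtime_major:
        return False
    # Rule 2 (special case): indexes built with 5.0 through 5.2 are rejected.
    # Assumption: "minor less than 3" applies to the index, not the runtime.
    if index_major == 5 and index_minor < 3:
        return False
    # Rule 3: minor versions more than 4 apart are rejected.
    if abs(runtime_minor - index_minor) > 4:
        return False
    return True
```

Under these assumptions, `can_open_index("5.2", "5.5")` is rejected, which matches the error shown in the comment, while `can_open_index("5.3", "5.5")` is accepted.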
     
