#221 Memory leak in Analyzer::reusableTokenStream() call

open
nobody
core (32)
9
2013-04-04
2013-04-04
Xiaoman Dong
No

In src\core\CLucene\index\DocumentsWriterThreadState.cpp, in the function void DocumentsWriter::ThreadState::FieldData::invertField(),
around line 892, the call stream = analyzer->reusableTokenStream(fieldInfo->name, reader); is supposed to return a reusable stream, but most of the analyzers simply create a new stream on every call, and that stream is never freed.

For now I am not sure how to implement the reusable token stream correctly (I should probably read the latest Lucene code), but in my local build I simply delete the stream after use, and the memory leak is gone.
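
The workaround described above can be illustrated with a minimal, self-contained sketch. It does not use the CLucene classes: MockTokenStream, MockAnalyzer and invertFields are hypothetical stand-ins, and the live-instance counter exists only to make the leak visible. The sketch assumes that the analyzer allocates a fresh stream on every call, as reported for most analyzers.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT the CLucene classes.
struct MockTokenStream {
    static int liveCount;               // streams currently allocated
    MockTokenStream()  { ++liveCount; }
    ~MockTokenStream() { --liveCount; }
};
int MockTokenStream::liveCount = 0;

struct MockAnalyzer {
    // Mimics the reported behaviour: despite the name, every call
    // allocates a brand-new stream instead of reusing an old one.
    MockTokenStream* reusableTokenStream(const std::string& /*fieldName*/) {
        return new MockTokenStream();
    }
};

// Simulates inverting `fields` fields and returns how many streams are
// still allocated afterwards. With deleteAfterUse == false, each
// iteration leaks one stream.
int invertFields(MockAnalyzer& analyzer, int fields, bool deleteAfterUse) {
    for (int i = 0; i < fields; ++i) {
        MockTokenStream* stream = analyzer.reusableTokenStream("body");
        assert(stream != nullptr);
        // ... tokens would be consumed from the stream here ...
        if (deleteAfterUse) {
            delete stream;              // the local workaround
        }
    }
    return MockTokenStream::liveCount;
}
```

Calling invertFields(analyzer, 3, true) leaves no stream allocated, while passing false leaves three behind. Note that deleting in the caller is only safe while the analyzer really hands out a new object each time; an analyzer that caches and returns the same stream would then be freed twice, so a proper fix probably needs a clear ownership rule.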

Discussion

  • Xiaoman Dong
    2013-04-04

    • priority: 5 --> 9
     
  • Xiaoman Dong
    2013-04-04

    I would like to help with this issue and borrow ideas from the latest Lucene development.

    The multi-thread support is a good improvement for my project. The bottleneck is actually string inversion, and using more threads will reduce the time cost.