#216 IndexWriter::optimize() may cause memory leak

open
nobody
None
5
2013-01-14
2013-01-14
porter
No

1. Delete some documents from an existing index, then close the reader and searcher.
2. Add some new documents to this index using IndexWriter.
3. Call IndexWriter::optimize() and IndexWriter::close().
4. Delete the analyzer (StopAnalyzer).
5. Delete the IndexWriter.

Now, Visual Leak Detector reports two memory leaks in StopAnalyzer::reusableTokenStream:
..
streams = _CLNEW SavedStreams(); --->Deleted.
streams->source = _CLNEW LowerCaseTokenizer(reader); --->Not deleted
streams->result = _CLNEW StopFilter(streams->source, true, stopTable); --->Not deleted.
...

I traced the code and found that IndexWriter::merge() deletes the FieldsReader. FieldsReader has a member variable (fieldsStreamTL), and when FieldsReader::~FieldsReader() is called it deletes all TLS data in _ThreadLocal, so the StopAnalyzer::SavedStreams object is deleted successfully. This is OK.
But when the test program calls 'delete analyzer', getPreviousTokenStream() returns NULL in StopAnalyzer::~StopAnalyzer(), so the two members of SavedStreams are never freed.

My workaround was to move "delete analyzer" before the call to IndexWriter::optimize(), which works well.
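
The ownership problem and the effect of reordering can be modelled without CLucene. The sketch below is only a simplified stand-in, not the library's code: Counted, Tokenizer, Filter, Analyzer, t_slot, clearThreadLocals() and liveAfter() are all hypothetical names, and it assumes that the `true` argument passed to StopFilter means the filter deletes its input stream.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes named in the report.
// A live-object counter makes leaks visible without a leak detector.
static int g_live = 0;

struct Counted {
    Counted() { ++g_live; }
    virtual ~Counted() { --g_live; }
    Counted(const Counted&) = delete;
    Counted& operator=(const Counted&) = delete;
};

struct Tokenizer : Counted {};            // models LowerCaseTokenizer

struct Filter : Counted {                 // models StopFilter
    Tokenizer* input;
    bool ownsInput;                       // assumption: mirrors the `true` flag
    Filter(Tokenizer* in, bool own) : input(in), ownsInput(own) {}
    ~Filter() override { if (ownsInput) delete input; }
};

struct SavedStreams : Counted {           // wrapper that does NOT free its members
    Tokenizer* source = nullptr;
    Filter* result = nullptr;
};

// Models the per-thread slot holding the analyzer's previous token stream.
static thread_local SavedStreams* t_slot = nullptr;

struct Analyzer {
    void reusableTokenStream() {
        if (t_slot == nullptr) {
            SavedStreams* s = new SavedStreams();
            s->source = new Tokenizer();
            s->result = new Filter(s->source, true);
            t_slot = s;
        }
    }
    ~Analyzer() {
        SavedStreams* s = t_slot;         // models getPreviousTokenStream()
        if (s != nullptr) {               // NULL here means nothing can be freed
            delete s->result;             // also deletes s->source
            delete s;
            t_slot = nullptr;
        }
    }
};

// Models the thread-local teardown reached from the merge path:
// only the wrapper itself is deleted.
void clearThreadLocals() {
    delete t_slot;
    t_slot = nullptr;
}

// Returns how many objects are still alive after both cleanup steps ran.
int liveAfter(bool deleteAnalyzerFirst) {
    assert(g_live == 0 && t_slot == nullptr);
    Analyzer* a = new Analyzer();
    a->reusableTokenStream();
    if (deleteAnalyzerFirst) { delete a; clearThreadLocals(); }
    else                     { clearThreadLocals(); delete a; }
    int leaked = g_live;
    g_live = 0;                           // reset; leaked objects stay abandoned
    return leaked;
}
```

In this model, liveAfter(false) corresponds to the original order (thread-local cleanup first, then 'delete analyzer') and leaves the tokenizer and the filter alive, while liveAfter(true) corresponds to the workaround and leaves nothing alive.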

You can also reproduce the problem by indexing a large number of documents (e.g. 450,000+, enough for IndexWriter to call merge()). The leak appears whenever IndexWriter::merge() is called; indexing a small number of documents does not trigger it, because IndexWriter::merge() is not called.

I think the same problem exists in StandardAnalyzer and in other analyzers that use a "SavedStreams" wrapper.
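
One possible direction, offered only as an assumption and not as the project's actual fix, is to let the wrapper free what it holds in its own destructor, so that whichever code path deletes the wrapper also releases the streams. The names below (Stream, OwningFilter, OwningSavedStreams, liveStreamsAfterWrapperDelete) are hypothetical and the sketch does not use CLucene.

```cpp
#include <cassert>

// Live-object counter used to check that nothing is left behind.
static int g_liveStreams = 0;

struct Stream {
    Stream() { ++g_liveStreams; }
    virtual ~Stream() { --g_liveStreams; }
    Stream(const Stream&) = delete;
    Stream& operator=(const Stream&) = delete;
};

// Hypothetical filter that owns and deletes its input stream.
struct OwningFilter : Stream {
    Stream* input;
    explicit OwningFilter(Stream* in) : input(in) {}
    ~OwningFilter() override { delete input; }
};

// Hypothetical wrapper that releases its streams when it is destroyed.
struct OwningSavedStreams {
    Stream* source = nullptr;   // owned through result once result is set
    Stream* result = nullptr;
    ~OwningSavedStreams() {
        if (result != nullptr) delete result;  // result deletes source
        else delete source;                    // no filter yet: free source
    }
};

// Deletes only the wrapper, as the thread-local teardown would,
// and returns the number of streams still alive afterwards.
int liveStreamsAfterWrapperDelete() {
    assert(g_liveStreams == 0);
    OwningSavedStreams* s = new OwningSavedStreams();
    s->source = new Stream();
    s->result = new OwningFilter(s->source);
    delete s;
    return g_liveStreams;
}
```

With this kind of ownership the order of 'delete analyzer' and the thread-local cleanup would no longer matter in the model, provided the analyzer destructor does not also try to delete the same streams.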

I am working on Windows 7 64-bit with Visual Studio 2008 and the latest version of Visual Leak Detector.

Thanks.

Discussion