Here is one more case where the indexing process leaves behind temporary files: extracting a tar.gz archive nested inside a rar archive. All files within the tarball are extracted to a temporary file named something like "/tmp/tzp17230914112986648485.tmp". Judging by the "tzp" prefix, the files appear to be generated by the TrueZip library, probably at the following point in TFile.java:
private static void compact(TFile grown) throws IOException {
[...]
final TFile compact = new TFile(createTempFile("tzp", suffix, dir));
The temporary files are always generated when handling tar.gz files. However, if the tar.gz file sits directly on the filesystem, or even inside a zip archive, the temporary files are deleted correctly. They are left behind only when the tar.gz file is stored in a rar archive.
I would really appreciate it if you could find some time to take a look at the archive handling code. As someone new to the codebase, I find it really hard to understand and debug the details of the indexing and archive handling in the source code.
The attached file will reproduce the problem. After indexing it, there will be a leftover file in /tmp/.
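To check for leftovers before and after an indexing run, something like the following sketch may be handy. It only lists files in the JVM's temp directory whose names match the "tzp" prefix and ".tmp" suffix seen above. The class name LeftoverTempFiles and the method findLeftovers are made up for this example and are not part of DocFetcher or TrueZip, and I am assuming the temp files land in java.io.tmpdir (which is /tmp on my system).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DocFetcher or TrueZip.
class LeftoverTempFiles {

    // Returns the regular files directly inside 'dir' whose names match
    // the glob "tzp*.tmp" (the naming pattern observed in this report).
    static List<Path> findLeftovers(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "tzp*.tmp")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    result.add(p);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the temp files are written to the JVM's default temp directory.
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        for (Path p : findLeftovers(tmp)) {
            System.out.println(p + " (" + Files.size(p) + " bytes)");
        }
    }
}
```

Running it once before indexing the attached file and once after should make any new leftover file easy to spot.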
Anonymous
The indexing algorithm is by far the hardest part of the DocFetcher source code. Even I, the person who wrote it 10 years ago, can barely understand it :-/
Again, as I said in my other responses, this temporary file issue is noted as high-priority, but I won't have time to look into it in the foreseeable future.