#210 Incorrect DB purge with --reindexall

current_cvs
closed-fixed
5
2012-09-10
2012-04-22
No

Applies to all versions of LXR, probably since 0.9.3 in 2004!

The delete_releases and delete_files statements are implicitly interrelated: a "release" is an alias for a base "revision" of a file. A base "revision" can therefore only be deleted once all "releases" pointing to it have been destroyed. In the present implementation, "releases" are deleted first. The base "revisions" are scanned afterwards: by then no "releases" remain, so the delete statement for table 'files' cannot match anything, since it has to compare each file's fileid against the releases' fileid (and those rows no longer exist).
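The ordering problem can be reproduced on a toy database. The following is only a sketch using Python's sqlite3 module: the two-table schema, the SQL text and the function names are simplified assumptions made for illustration, not the actual LXR statements. It contrasts the faulty order with one possible correct order (delete the "releases" rows, then drop the 'files' rows that no release references any more).

```python
import sqlite3

# Hypothetical, simplified schema: real LXR tables have more columns.
def make_demo_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE files    (fileid INTEGER PRIMARY KEY, filename TEXT, revision TEXT);
        CREATE TABLE releases (fileid INTEGER NOT NULL, release TEXT NOT NULL);
        INSERT INTO files VALUES (1, 'a.c', 'r1'), (2, 'b.c', 'r1');
        -- file 1 belongs to v1 only, file 2 is shared by v1 and v2
        INSERT INTO releases VALUES (1, 'v1'), (2, 'v1'), (2, 'v2');
    """)
    return db

def count_files(db):
    return db.execute("SELECT COUNT(*) FROM files").fetchone()[0]

def purge_buggy(db, release):
    # Step 1: the "releases" rows go away first ...
    db.execute("DELETE FROM releases WHERE release = ?", (release,))
    # Step 2: ... so this subquery is now empty and no file is ever deleted.
    db.execute(
        "DELETE FROM files WHERE fileid IN "
        "(SELECT fileid FROM releases WHERE release = ?)",
        (release,),
    )

def purge_then_drop_orphans(db, release):
    # Same first step, but the second step looks for files that no
    # remaining release points to, instead of joining on deleted rows.
    db.execute("DELETE FROM releases WHERE release = ?", (release,))
    db.execute("DELETE FROM files WHERE fileid NOT IN (SELECT fileid FROM releases)")
```

Calling purge_buggy(db, "v1") on the demo data leaves both 'files' rows in place, while purge_then_drop_orphans(db, "v1") removes file 1 and keeps file 2, which is still referenced by v2.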

This leaves the 'files' records in the DB. In the case of the kernel, that amounts to more than 37000 items per version.

When reindexing, it is very likely that the new "revision" of a file will differ from the old one (this is almost certain with VCSes, only slightly probable with plain files). The orphaned "revisions" will therefore not be reused and new records will be created. The DB grows needlessly and performance suffers.

Even without that, the whole process is too slow (same order of magnitude as the indexing itself!) since it involves a selective query with references to several tables.

The first issue ('files' table) could be resolved with a reference count.
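A reference count could look roughly like the sketch below. It is an assumption-laden illustration in Python with sqlite3, not a proposal for the actual schema: the refcount column, the primary key on (fileid, release) and the helper names are all hypothetical.

```python
import sqlite3

# Hypothetical schema: 'refcount' holds the number of releases using the file.
def make_refcount_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE files (
            fileid   INTEGER PRIMARY KEY,
            revision TEXT,
            refcount INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE releases (
            fileid  INTEGER NOT NULL,
            release TEXT NOT NULL,
            PRIMARY KEY (fileid, release)  -- one row per file and release
        );
    """)
    return db

def add_release(db, fileid, release):
    # Register the alias and count it on the base revision.
    db.execute("INSERT INTO releases VALUES (?, ?)", (fileid, release))
    db.execute("UPDATE files SET refcount = refcount + 1 WHERE fileid = ?", (fileid,))

def purge_release(db, release):
    # Decrement while the 'releases' rows still exist, then remove them.
    db.execute(
        "UPDATE files SET refcount = refcount - 1 WHERE fileid IN "
        "(SELECT fileid FROM releases WHERE release = ?)",
        (release,),
    )
    db.execute("DELETE FROM releases WHERE release = ?", (release,))
    # Files no longer used by any release can be found without a join.
    db.execute("DELETE FROM files WHERE refcount <= 0")
```

With this layout the final delete on 'files' no longer depends on the content of 'releases', so the order of the last two statements stops mattering. The cost is that every insertion into 'releases' must also update the counter.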

Discussion

  • Andre-Littoz - 2012-09-10

    The whole DB cleaning process has been rewritten for release 1.0.

    DB purge is now correct. Performance has not been tested, but should benefit from the DB backend reorganization.

     
  • Andre-Littoz - 2012-09-10
    • status: open --> closed-fixed
     
