The attached patch provides two changes that together significantly improve the performance of building the inverted index:
1. A built-in external sort for the postings file, replacing the call to the system's sort command.
2. A binary search instead of a linear search when looking up a file index in invmake().
On my machine, these changes reduce the time to build a new database for the Linux kernel (~35K files) from over 52 seconds to 33 seconds (with a warm cache).
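For readers who have not opened the patch, the second change can be illustrated roughly as follows. This is only a minimal sketch, not the code from the patch: the helper name find_file_index and the assumption that the file names sit in an array sorted by strcmp() order are hypothetical, chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the patch): return the index of
 * `name` in `names[0..count)`, or -1 if it is not present.
 * Assumes `names` is sorted in strcmp() order.
 */
static long find_file_index(const char *const *names, size_t count,
                            const char *name)
{
    size_t lo = 0;      /* first candidate index */
    size_t hi = count;  /* one past the last candidate index */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;  /* avoids overflow of lo + hi */
        int cmp = strcmp(name, names[mid]);

        if (cmp == 0)
            return (long)mid;   /* found */
        if (cmp < 0)
            hi = mid;           /* continue in the lower half */
        else
            lo = mid + 1;       /* continue in the upper half */
    }
    return -1;                  /* not found */
}
```

With roughly 35K files, each lookup of this kind needs about 16 string comparisons instead of up to 35K for a linear scan, which is presumably where part of the saving comes from.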