I got some valuable feedback from Glen Newton, an information retrieval researcher and developer who uses Lucene in high-performance settings.
He offered to review our code and pointed out that our luindex benchmark is somewhat naive in that it serializes I/O and indexing. A better architecture places a buffer between separate producer and consumer threads, which do the I/O and the indexing respectively. The producers and consumers can themselves be parallelized. This makes better use of (multicore) hardware and makes the benchmark less I/O-dominated.
Our existing code is based on tutorial code from Lucene, but Glen's suggestion reflects what industrial applications of Lucene actually do, so we should move in that direction for the next release.
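
For illustration, the buffered producer/consumer arrangement Glen describes might look roughly like the sketch below. This is not the luindex code and it makes no Lucene calls: the class PipelineSketch, the load() helper, the DONE sentinel and the counter that stands in for the index are all hypothetical names chosen for the example. It only shows the shape, with a bounded queue as the buffer, a pool of reader threads filling it and a pool of indexer threads draining it.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PipelineSketch {
    // Sentinel ("poison pill") telling a consumer that no more work will arrive.
    // A fresh String object so it can be recognised by identity.
    private static final String DONE = new String("<done>");

    // Hypothetical stand-in for the I/O step (reading a document from disk).
    static String load(String name) {
        return "contents of " + name;
    }

    // Runs the pipeline and returns how many documents were "indexed".
    static int run(List<String> names, int producers, int consumers, int capacity) {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(capacity);
        AtomicInteger next = new AtomicInteger();    // next input to read
        AtomicInteger indexed = new AtomicInteger(); // stand-in for the index

        ExecutorService readers = Executors.newFixedThreadPool(producers);
        ExecutorService indexers = Executors.newFixedThreadPool(consumers);

        // Consumers: take documents from the buffer until the sentinel arrives.
        for (int c = 0; c < consumers; c++) {
            indexers.submit(() -> {
                try {
                    for (String doc = buffer.take(); doc != DONE; doc = buffer.take()) {
                        indexed.incrementAndGet(); // a real indexer would add doc to the index here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Producers: claim the next unread input and put its contents in the buffer.
        // put() blocks while the buffer is full, so readers cannot run far ahead.
        for (int p = 0; p < producers; p++) {
            readers.submit(() -> {
                try {
                    int i;
                    while ((i = next.getAndIncrement()) < names.size()) {
                        buffer.put(load(names.get(i)));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            readers.shutdown();
            readers.awaitTermination(1, TimeUnit.MINUTES);
            // All input has been queued: send one sentinel per consumer.
            for (int c = 0; c < consumers; c++) {
                buffer.put(DONE);
            }
            indexers.shutdown();
            indexers.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return indexed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a.txt", "b.txt", "c.txt"), 2, 2, 2));
    }
}
```

The bounded buffer is the main design choice here: it lets I/O and indexing overlap while capping memory use, and the two pool sizes can be tuned independently to match the disk and the number of cores.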