We found this project very interesting and aim to use it to develop applications. However, it seems the project is no longer supported (unanswered threads, unfixed bugs, etc.).
Could the administrator please clarify the current status of the project?
We found the answer: take a look at the Tika project: http://incubator.apache.org/tika/
However, Tika is only a metadata parser; it is not an indexer itself.
I have not seen any activity in recent years. I use it and maintain it myself. The code is very simple.
I recommend you study Lucene first. Lius is then a complement that is easy to extend.