The project aims to design and implement a text indexing engine that can be easily integrated with an information retrieval system. The design should account for the specific features of the Arabic language. Indexing is based on a text segmentation technique that was first investigated three years ago. The outcome of the project is a portable, free, open-source library usable within an established information retrieval system such as Xapian (http://xapian.org/) or Lucene (http://lucene.apache.org/java/docs/index.html). Particular attention will be paid to old, traditional Arabic vocabulary and documents.