Text categorization, Arabic language processing, language modeling
...Smaili (2005) Comparison of Topic Identification Methods for Arabic Language, RANLP05: Recent Advances in Natural Language Processing, pp. 14-17, 21-23 September 2005, Borovets, Bulgaria.
More useful references to check:
-------------------------------------------
https://sites.google.com/site/mouradabbas9/corpora
THIS PROJECT HAS MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/
...Its development started in 2010 as a PhD thesis, but the project remains active (see the SVN logs).
Up-to-date documentation and details about the tool can be found on the mwetoolkit website: http://mwetoolkit.sourceforge.net/