Lack of multi-application text corpus despite of the surging text data is a serious bottleneck in the text mining and natural language processing especially in Persian language. This project presents a new corpus for NEWS articles analysis in Persian called Persica. NEWS analysis includes NEWS classification, topic discovery and classification, category classification and many more procedures. Dealing with NEWS has special requirements and first of all a valid and reliable corpus to perform the experiments on them.
Please use this reference to cite us:
@inproceedings{eghbalzadeh2012persica,
title={Persica: A Persian corpus for multi-purpose text mining and Natural language processing},
author={Eghbalzadeh, Hamid and Hosseini, Behrooz and Khadivi, Shahram and Khodabakhsh, Ali},
booktitle={Telecommunications (IST), 2012 Sixth International Symposium on},
pages={1207--1214},
year={2012},
organization={IEEE}
}
Persica-A new Persian corpus for NLP
This project presents a new corpus for NEWS text analysis in Persian
Brought to you by:
eghbalz
Downloads:
4 This Week