Perstem is a Persian (Farsi) stemmer, morphological analyzer, transliterator, and partial part-of-speech tagger. Inflexional morphemes are separated or removed from their stems. Perstem can also tokenize and transliterate between various character set encodings and romanizations.
Features
- Stems
- Analyzes Morphology
- Accepts & Transliterates between UTF-8, Windows-1256, ISIRI-3342, HTML-style Numeric Character References, ArabTeX romanization, and Dehdari transliteration
- Displays Part-of-Speech Tags for Many Words
- Tokenizes
- Handles Irregular Verbs, Semi-Regular Verbs, and Many Broken Plurals
- Very Fast
- Small Single File, Requiring no External Data
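To make the encoding side of the feature list concrete, here is a minimal Python sketch of converting a Persian word between UTF-8, Windows-1256, and HTML-style numeric character references. It is not Perstem's code or its interface: the helper names `recode`, `to_ncr`, and `from_ncr` are hypothetical, and only Python's standard library codecs are used. ISIRI-3342 and the two romanizations are left out because the standard library has no codec for them.

```python
import html


def recode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode raw bytes from one character set encoding to another."""
    return data.decode(src).encode(dst)


def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference (&#NNNN;)."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_ncr(text: str) -> str:
    """Turn numeric character references back into Unicode characters."""
    return html.unescape(text)


word = "کتاب"  # "book"; every letter here exists in Windows-1256

utf8_bytes = word.encode("utf-8")
cp1256_bytes = recode(utf8_bytes, "utf-8", "cp1256")

print(len(utf8_bytes), len(cp1256_bytes))  # two bytes per letter vs. one
print(to_ncr(word))
print(from_ncr(to_ncr(word)) == word)
```

Not every Persian letter has a Windows-1256 code point, so a real converter needs a fallback for characters the target encoding cannot represent; this sketch simply raises `UnicodeEncodeError` in that case.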
License
GNU General Public License version 3.0 (GPLv3)