This is a Gujarati stemmer written in Java. Stemming is the process of removing affixes from a word to obtain its root (stem), so that morphological variants of a word map to a common root. For example, the word "પ્રતિઉપયોગી" has the stem "ઉપયોગ". Stemmers are language-specific tools, and designing a stemming algorithm requires a significant level of linguistic expertise. A lot of significant work has gone into developing and evaluating stemmers for non-Indian languages, but little or no significant work has been done for Indian languages, especially Gujarati. The code of this stemmer is based on an algorithm designed under the guidance of Prof. Nikita Desai, India. It takes an input file of type .txt containing Gujarati text encoded as UTF-8 and first removes the stop words, which are not essential. After processing the remaining words, it writes a corresponding output file containing all the stem words plus other details.
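
The preprocessing step described above (read UTF-8 text, split it into words, drop stop words) can be sketched roughly as follows. This is only an illustration, not the project's code: the class name StopWordFilterSketch, its method names, the tokenization rule, and the four sample stop words are all assumptions, since the description does not list the stop words the project actually uses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and data are not taken from the project.
final class StopWordFilterSketch {

    // Illustrative sample only: a few common Gujarati function words.
    // The project's real stop-word list is not given in the description.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("અને", "છે", "આ", "તે"));

    // Split on whitespace, ASCII punctuation, and the danda (U+0964).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[\\s\\p{Punct}\\u0964]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Keep only the tokens that are not in the stop-word set.
    static List<String> removeStopWords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // Read a UTF-8 encoded .txt file and return its non-stop-word tokens.
    static List<String> readAndFilter(Path input) throws IOException {
        String text = new String(Files.readAllBytes(input), StandardCharsets.UTF_8);
        return removeStopWords(tokenize(text));
    }
}
```

Decoding the bytes explicitly as UTF-8 avoids depending on the platform default charset, which matters on Windows, the operating system this project lists.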

Features

  • Offline stemmer for Gujarati, an Indo-Aryan language spoken in India.
  • Separate classes for the prefix, suffix, substitution, and dictionary checks.
  • Accepts any file containing Gujarati text and generates an output file with the stem of each word in the input file.
  • Stop words, which contribute little to language processing or information retrieval tasks, are removed in preprocessing; the stemming algorithm is then applied to the remaining words.
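
As a rough illustration of how prefix and suffix checks can be combined with a dictionary check, here is a minimal sketch. It is not the algorithm the project implements: the class AffixStemmerSketch, the single prefix and suffix, and the one-entry dictionary are hypothetical and chosen only to reproduce the "પ્રતિઉપયોગી" example from the description. Substitution rules are left out because the description gives no details about them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; rules and dictionary are illustrative, not the project's data.
final class AffixStemmerSketch {

    static final List<String> PREFIXES = Arrays.asList("પ્રતિ");
    static final List<String> SUFFIXES = Arrays.asList("ી");
    static final Set<String> DICTIONARY = new HashSet<>(Arrays.asList("ઉપયોગ"));

    // Return the affix list with "" in front, so "strip nothing" is tried first.
    private static List<String> withEmpty(List<String> affixes) {
        List<String> options = new ArrayList<>();
        options.add("");
        options.addAll(affixes);
        return options;
    }

    // Try every prefix/suffix combination and accept the first candidate
    // confirmed by the dictionary; otherwise return the word unchanged.
    static String stem(String word) {
        for (String prefix : withEmpty(PREFIXES)) {
            if (!word.startsWith(prefix)) {
                continue;
            }
            String rest = word.substring(prefix.length());
            for (String suffix : withEmpty(SUFFIXES)) {
                if (!rest.endsWith(suffix)) {
                    continue;
                }
                String candidate = rest.substring(0, rest.length() - suffix.length());
                if (DICTIONARY.contains(candidate)) {
                    return candidate;
                }
            }
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(stem("પ્રતિઉપયોગી"));
    }
}
```

Requiring a dictionary hit before accepting a stripped form is one way to avoid over-stemming; a word with no confirmed candidate is returned as it was given.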

Additional Project Details

Operating Systems

Windows

Intended Audience

Science/Research

User Interface

Java AWT

Programming Language

Java

Related Categories

Java Machine Translation Software

Registered

2014-09-29