Offline stemmer for Gujarati, one of the 22 scheduled languages of India.
This is a Gujarati stemmer written in Java. Stemming is the process of removing affixes from a word to obtain its root (stem), so that morphological variants of a word map to a common root. For example, the word "પ્રતિઉપયોગી" has the stem "ઉપયોગ". Stemmers are language-specific tools, and designing a stemming algorithm requires a significant level of linguistic expertise. There has been a lot of significant work on the development and evaluation of stemmers for non-Indian languages, but very little...
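The affix removal described above can be illustrated with a minimal sketch. It is not the project's algorithm: the class name AffixStemmerSketch and the affix lists passed to it are hypothetical, and a real Gujarati stemmer would need linguistically motivated rules rather than a single prefix and suffix.

```java
import java.util.List;

// Illustrative affix-stripping stemmer. The affix lists are supplied by the
// caller and are assumptions for this example, not data from the project.
final class AffixStemmerSketch {
    private final List<String> prefixes;
    private final List<String> suffixes;

    AffixStemmerSketch(List<String> prefixes, List<String> suffixes) {
        this.prefixes = prefixes;
        this.suffixes = suffixes;
    }

    // Strips at most one matching prefix and one matching suffix.
    // An affix is only removed if something remains of the word afterwards.
    String stem(String word) {
        String result = word;
        for (String prefix : prefixes) {
            if (result.startsWith(prefix) && result.length() > prefix.length()) {
                result = result.substring(prefix.length());
                break;
            }
        }
        for (String suffix : suffixes) {
            if (result.endsWith(suffix) && result.length() > suffix.length()) {
                result = result.substring(0, result.length() - suffix.length());
                break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical one-entry affix lists chosen to match the example word.
        AffixStemmerSketch stemmer =
                new AffixStemmerSketch(List.of("પ્રતિ"), List.of("ી"));
        System.out.println(stemmer.stem("પ્રતિઉપયોગી"));
    }
}
```

With these assumed lists, the prefix "પ્રતિ" and the suffix "ી" are both removed from the example word. Words that carry neither affix are returned unchanged.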
This is a Java-based project for complex event extraction from text and for co-reference resolution. Currently the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction involves finding the events in a text and the parameters of each event.
Maui is a multi-purpose automatic topic indexing algorithm. Given a document, Maui automatically identifies its topics. Depending on the task, topics are tags, keywords, keyphrases, vocabulary terms, descriptors or Wikipedia titles.
Supertagging is a process of statistical lexical disambiguation, used as a preprocessing step to parsing, that assigns LTAG tree categories to the lexical items in the input sentence. Thus, if the input sentence is given as a dependency tree, the task of the supertagger is to assign the most probable TAG family to each node and edge in the dependency tree.
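The idea of assigning the most probable category to each lexical item can be sketched with a simple frequency baseline. This is a deliberate simplification and not the project's model: it looks at each word in isolation and ignores the dependency tree, and the class name UnigramSupertaggerSketch and the supertag labels are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Baseline supertagger sketch: picks, for each word, the supertag seen most
// often with that word in training data. Context is ignored entirely.
final class UnigramSupertaggerSketch {
    // word -> (supertag -> number of times observed together)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // Records one training observation of a word with its supertag.
    void observe(String word, String supertag) {
        counts.computeIfAbsent(word, w -> new HashMap<>())
              .merge(supertag, 1, Integer::sum);
    }

    // Returns the most frequent supertag for the word, or the fallback for
    // unseen words. Ties are broken arbitrarily.
    String mostProbable(String word, String fallback) {
        Map<String, Integer> tagCounts = counts.get(word);
        if (tagCounts == null) {
            return fallback;
        }
        return Collections.max(tagCounts.entrySet(),
                               Map.Entry.comparingByValue()).getKey();
    }

    // Tags every word of a sentence independently.
    List<String> tag(List<String> sentence, String fallback) {
        List<String> tags = new ArrayList<>();
        for (String word : sentence) {
            tags.add(mostProbable(word, fallback));
        }
        return tags;
    }

    public static void main(String[] args) {
        UnigramSupertaggerSketch tagger = new UnigramSupertaggerSketch();
        // Hypothetical supertag labels, for illustration only.
        tagger.observe("runs", "TRANSITIVE_VERB");
        tagger.observe("runs", "INTRANSITIVE_VERB");
        tagger.observe("runs", "INTRANSITIVE_VERB");
        System.out.println(tagger.tag(List.of("she", "runs"), "UNKNOWN"));
    }
}
```

A statistical supertagger would normally condition on neighbouring words or on the tree structure. The baseline only shows the "most probable category per item" step in its simplest form.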