Page 3 | text processing free download

Morfologik

ATTENTION! Morfologik is now at GitHub: https://github.com/morfologik/

1 Review

Downloads: 0 This Week

Last Update: 2015-09-10

See Project

Virastyar

Virastyar is a spell checker for low-resource languages

Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and other RTL languages. Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering,...

14 Reviews

Downloads: 56 This Week

Last Update: 2020-03-05

See Project

Modular Audio Recognition Framework

MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

3 Reviews

Downloads: 0 This Week

Last Update: 2015-10-06

See Project

Java Data Mining Package

The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).

Downloads: 0 This Week

Last Update: 2015-08-19

See Project

JInsect

The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.

3 Reviews

Downloads: 0 This Week

Last Update: 2015-08-25

See Project

Stemmer Gujarati

Offline stemmer for Gujarati, which is one of 22 Indian languages.

...There has been a lot of significant work on the development and evaluation of stemmers for non-Indian languages, but little or no significant work has been done for Indian languages, especially Gujarati. The code of this stemmer is based on an algorithm designed under the guidance of Prof. Nikita Desai, India. It takes a .txt input file containing Gujarati text encoded as UTF-8 and removes stop words, which are not essential. After processing the remaining words, it writes a corresponding output file containing all stem words plus other details.

Downloads: 0 This Week

Last Update: 2015-04-05

See Project

Text Analyzer

Text analyzing software

An application developed in C, using list and AVL tree data structures, that analyzes a text (.txt) file and outputs the following information: 1. the total number of occurrences of every word in the text; 2. the exact line of every occurrence of every word; 3. the exact position in the line of every occurrence of every word; 4. the exact paragraph of every occurrence of every word; 5. the exact sentence of every occurrence of every word. The output is also written in a...

Downloads: 0 This Week

Last Update: 2014-11-05

See Project
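To illustrate the kind of index the Text Analyzer entry above describes, here is a minimal Python sketch. It is not the project's C code: it swaps the AVL tree for a built-in dictionary, covers only the word counts, line numbers, and in-line positions (paragraph and sentence tracking are left out), and the function name `index_words` is hypothetical.

```python
import re
from collections import defaultdict


def index_words(text):
    """Map each lowercased word to a list of (line number, position in line).

    Line numbers and positions are 1-based. The number of occurrences of a
    word is the length of its list.
    """
    occurrences = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Treat every run of word characters as one word (an assumption;
        # the original project's tokenization rules are not documented here).
        for position, match in enumerate(re.finditer(r"\w+", line), start=1):
            occurrences[match.group().lower()].append((line_no, position))
    return dict(occurrences)


if __name__ == "__main__":
    sample = "The cat sat.\nThe dog saw the cat."
    for word, places in sorted(index_words(sample).items()):
        print(word, len(places), places)
```

A dictionary gives average constant-time lookups; a balanced tree such as the AVL tree named in the entry would additionally keep the words in sorted order.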

ArabicDiacritizer

Automatic restoration of Arabic diacritic marks

This software restores Arabic diacritical marks. It is based mainly on deep architectures using deep neural networks. The algorithm generates diacritized text with determined end case. The algorithm is described in detail in: Ilyes Rebai and Yassine BenAyed, 'Text-to-speech synthesis system with Arabic diacritic recognition system', Computer Speech & Language, 2015. We would appreciate it very much if you cite our related work. Installation...

Downloads: 0 This Week

Last Update: 2014-12-16

See Project

Lingala NLP

This project is devoted to the development of natural language processing tools and resources for the Lingala language, which is spoken by tens of millions of people in central Africa.

Downloads: 0 This Week

Last Update: 2014-11-13

See Project

Khawas

An Arabic Corpora Processing Tool

The new version is available at https://sourceforge.net/projects/ghawwasv4/

Downloads: 1 This Week

Last Update: 2014-08-02

See Project

T.H.O.R.I.U.M.

T.H.O.R.I.U.M. - Thermooptic radiation iterative universal module.

The purpose of this project is to develop open source, precise, fast and easy-to-use software for radiation heat transfer analysis.

Downloads: 0 This Week

Last Update: 2016-10-27

See Project

SetFon Speech Analyzer - Web Praat

SetFon is a web-based interface for Praat resources (www.praat.org) that focuses on speech sound analysis; it is a management program for acoustic analysis, based on PHP/MySQL. Developed with the SIMP framework.

Downloads: 0 This Week

Last Update: 2015-11-13

See Project

BioNLP UIMA Component Repository

The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known 3rd-party NLP tools used in biomedical text processing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.

Downloads: 0 This Week

Last Update: 2014-07-09

See Project

FALCON - Text Search Java Project

JSON based text search Java Project

What is it? "Falcon Search" is a Java API and tool for searching inside documents. It was originally started to search the content of PDF files under the project "HAWK Search". Searching with this tool is query-based, not word-based as in most document search tools or document readers. It also handles jumbled word order within a query and spelling mistakes. Commonly used techniques in this project are Natural Language...

Downloads: 0 This Week

Last Update: 2014-04-18

See Project

Bermuda Text-to-Speech

This project includes basic NLP and DSP techniques for Text-to-Speech

See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.

Downloads: 0 This Week

Last Update: 2014-03-24

See Project

HAWK - PDF Text Search Java Project

No more support for this project

This project is no longer supported. Take a look at FalconSearch: https://sourceforge.net/projects/falcontextsearch/

Downloads: 0 This Week

Last Update: 2014-04-19

See Project

TF-IDF Measure

TF-IDF.jar is a Java Archive file for measuring the TF-IDF of each document in a document collection (corpus). The jar can be used to (a) get all the terms in the corpus, (b) get the document frequency (DF) and inverse document frequency (IDF) of all the terms in the corpus, (c) get the TF-IDF of each document in the corpus, and (d) get each term with its frequency (number of occurrences), term frequency (TF), and TF-IDF in every document.

Downloads: 0 This Week

Last Update: 2015-12-17

See Project
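As a rough illustration of the quantities listed in the TF-IDF Measure entry above, the following Python sketch computes DF, IDF, TF, and TF-IDF for a small tokenized corpus. It is not the jar's API: the function name `tf_idf` and the data layout are hypothetical, and the weighting used here (TF as count divided by document length, IDF as the natural log of N/DF) is one common variant that the jar may or may not use.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Compute DF, IDF, and per-document term statistics.

    corpus: dict mapping a document name to its list of tokens.
    Returns (df, idf, scores), where scores[doc][term] holds the raw
    count, the term frequency (TF), and the TF-IDF weight.
    """
    n_docs = len(corpus)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in corpus.values():
        df.update(set(tokens))

    # Assumed variant: idf = ln(N / df), with no smoothing.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    scores = {}
    for name, tokens in corpus.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            term: {
                "count": count,
                "tf": count / total,
                "tf_idf": (count / total) * idf[term],
            }
            for term, count in counts.items()
        }
    return df, idf, scores


if __name__ == "__main__":
    docs = {
        "d1": ["text", "processing", "text"],
        "d2": ["processing", "tools"],
    }
    df, idf, scores = tf_idf(docs)
    print(sorted(df))      # all terms in the corpus
    print(scores["d1"])    # per-term statistics for one document
```

With this unsmoothed IDF, a term that occurs in every document gets a weight of zero, which is why many implementations add smoothing.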

Knowtator

Knowtator is a general-purpose text annotation tool that is integrated with the Protégé knowledge representation system. Knowtator facilitates the manual creation of training and evaluation corpora for a variety of biomedical language processing tasks.

Downloads: 0 This Week

Last Update: 2013-11-08

See Project

gannu

Java API and tools for performing NLP and other AI tasks

Java API and tools for performing a wide range of AI tasks such as: word sense disambiguation (released), optimization (5 Evolutionary Algorithms Implemented ETA February 2014), opinion mining (ETA November 2014) and text wikification (ETA July 2014). Gannu includes some graphical interfaces for scientific purposes. When using Gannu please cite: *Jiménez, F. V., Gelbukh, A. F. & Sidorov, G. (2013). Simple Window Selection Strategies for the Simplified Lesk Algorithm for Word Sense...

Downloads: 0 This Week

Last Update: 2013-12-16

See Project

An ethernet sniffer for BrainNet36®

An ethernet sniffer for the EEG acquisition system BrainNet36®

...BrainNet36® has 36 channels, A/D converters with 16 bit accuracy, conversion time of 10 µs and Ethernet communication interface. Being a device for clinical purposes, BrainNet36® does not export data online. This sniffer was developed to allow online processing by working in promiscuous mode and recording data in a plain text file.

Downloads: 0 This Week

Last Update: 2015-01-01

See Project

BioLemmatizer

Lemmatization tool for morphological analysis of biomedical literature

...If you use the BioLemmatizer to support academic research, please cite the following paper: Haibin Liu, Tom Christiansen, William A Baumgartner Jr, and Karin Verspoor. BioLemmatizer: a lemmatization tool for morphological processing of biomedical text. Journal of Biomedical Semantics 2012, 3:3.

Downloads: 0 This Week

Last Update: 2013-10-23

See Project

Transformation-Based Learning in Java

Java application for training and deploying text processing applications such as part-of-speech taggers, based on a re-implementation of Brill's algorithm in Java.

Downloads: 0 This Week

Last Update: 2014-04-23

See Project

LinqYedict

Translate Chinese to English

Translates Chinese to English using CEDICT (Cantonese dictionary). Demonstrates the speed of C# and LINQ. Copy Chinese text from any browser or application to the Windows clipboard and see the translation.

Downloads: 0 This Week

Last Update: 2015-11-21

See Project

BioDare

BioDare is a Biological Data Repository focused on time-series data

BioDare (Biological Data Repository) was developed under the multi-site ROBuST project (http://hallidaylab.bio.ed.ac.uk/ROBuST.html) to support data exchange inside the project. It is a web application which allows data-sharing (including public dissemination), data-processing and analysis, with the main focus on time-series data produced in circadian experiments. The main features of BioDare are:
- an online repository for experimental data accompanied by extensive metadata
- generation of secondary data (normalized, detrended, averaged …)
- graphical output of data, secondary data and rhythm analysis
- simple text-based search throughout metadata
- biology- and conditions-aware search for data
- data aggregation and export
- group-based privacy settings for collaborative research

Downloads: 1 This Week

Last Update: 2013-09-19

See Project

latexdiff

latexdiff is a Perl script that compares two LaTeX files and marks up significant differences between them (i.e. a diff for LaTeX files). Various options are available for visual markup using standard LaTeX packages such as "color.sty".

Downloads: 0 This Week

Last Update: 2014-06-09

See Project

Search Results for "text processing" - Page 3

Showing 151 open source projects for "text processing"

