Page 3 | gnu/linux free download

Online Transcription Editor (OTE)

A tool for Visual Transcriptions of biblical texts at INTF and ITSEE

The Online Transcription Editor was developed as part of the joint project "Workspace for Collaborative Editing". It is used for transcriptions at the INTF in Münster and the ITSEE in Birmingham.

Downloads: 0 This Week

Last Update: 2021-03-02

korpus

Corpus Linguistics Software

Software for corpus linguistics, including a corpus text editor, web-based search, and more. The project was created for the Belarusian Corpus but can be used for other languages with some adaptation.

Downloads: 0 This Week

Last Update: 2021-02-02

Korean Analyzer Rhino

Parsing Korean words by morpheme and part-of-speech

RHINO parses Korean words by morpheme and part of speech. Its dictionaries are based on the Korean Modern Tagged Corpus (on the scale of 12 million phrases), which was created by the Korean government, so it handles many combinations of stems and endings. The newly developed Dynamic Dictionary Technology lets words react to their context; in effect, it is a programmed database. For more information, see the files in the help folder.

Downloads: 18 This Week

Last Update: 2020-10-11

Leseratte

Leseratte is a Java parser for written German. Currently, it contains a German lexicon (based on Wiktionary), inflexion rules, a grammar and a parser. (A semantics component is planned.) It is usable as a Java library and also provides a graphical UI.

Downloads: 0 This Week

Last Update: 2020-10-03

Artha ~ The Open Thesaurus

Artha is a handy thesaurus based on WordNet with distinct features like global hotkey look-up, passive desktop notifications, regular-expression-based search, etc. Artha may be used as a free open-source replacement for the proprietary WordWeb Pro.

11 Reviews

Downloads: 73 This Week

Last Update: 2020-07-27

Autshumato MTWS

Autshumato Machine Translation Web Service

... - Exposed API for all of the services. - Ability to log into the system using your Google or Facebook ID. - All requests are logged by IP. Licensed under the GNU GPL v3 (or later): http://www.gnu.org/licenses/gpl-3.0.txt

Downloads: 0 This Week

Last Update: 2020-07-17

nlpcr

Natural language processing using coroutines in C

Downloads: 0 This Week

Last Update: 2020-05-27

EME

Episodic Memory Extractor

The next step after https://sourceforge.net/projects/aseryla/. Work in progress.

Downloads: 0 This Week

Last Update: 2020-05-22

SimpleLemmatizer

This program is for text lemmatization

It lemmatizes texts based on a supplied model. The base model is for Slovak texts and was created from the Slovak National Corpus, copyright Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences.

Downloads: 0 This Week

Last Update: 2020-03-22

KSUCCA Corpus

A 50-million-token corpus of Classical Arabic.

King Saud University Corpus of Classical Arabic (KSUCCA) is a pioneering annotated corpus of 50 million tokens of Classical Arabic texts, spanning the pre-Islamic era to the fourth Hijri century (equivalent to the seventh to early eleventh century CE), which is the period of pure Classical Arabic. The main aim of this corpus is to support the study of the distributional lexical semantics of the words of the Quran. However, it can be used for other research purposes, such...

Downloads: 2 This Week

Last Update: 2020-02-19

Meaning Explorer

A tool for analyzing the words of the Quran

The main purpose of this tool is to help users in extracting syntagmatic relations between words, lemmas and roots available in the Quran; these relations include identifying significant collocates and words’ co-occurrences. In addition, the tool also provides other helpful functionalities that complement the primary purpose, which include a Key Word In Context (KWIC) concordance, in addition to frequency lists of all words, lemmas and roots in the holy Quran. The main intended users of this...
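
As an illustration of what a Key Word In Context (KWIC) concordance produces, here is a minimal sketch. It is not taken from Meaning Explorer; the function name `kwic` and the `width` parameter are hypothetical, and the input is assumed to be an already tokenized text.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each hit.

    `width` is the number of tokens shown on each side of the keyword.
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            # Slice up to `width` tokens before and after the hit.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


if __name__ == "__main__":
    sample = "the cat sat on the mat".split()
    for left, kw, right in kwic(sample, "the", width=2):
        # Right-align the left context so the keywords line up in a column.
        print(f"{left:>10} [{kw}] {right}")
```

A real concordancer for the Quran would also match on lemmas and roots rather than surface forms only; this sketch matches exact tokens.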

Downloads: 0 This Week

Last Update: 2019-12-03

Safe Harbor Deidentification

Safe Harbor Deidentification for medical documents

Phalanx - Deidentify. The Safe Harbor Deidentification Mode of Phalanx is an abridged pipeline of NLP annotators culminating in NER annotators, which write text offsets as output. It uses the Safe Harbor deidentification method.

Downloads: 0 This Week

Last Update: 2019-09-10

TIES

A smart search engine for medical documents

TIES (Text Information Extraction System) is a clinical text search engine that uses Natural Language Processing techniques to extract medical concepts from free-text clinical reports. It provides secure de-identified access to this information and has built-in collaboration tools and honest broker functionality. It is licensed for academic use under the BSD license. For commercial use, please contact Nexi at http://nexihub.com *** NOTICE: this software and forum are no longer...

1 Review

Downloads: 0 This Week

Last Update: 2019-09-09

UnsupervisedMT

Phrase-Based & Neural Unsupervised Machine Translation

Unsupervised Machine Translation is a research repository that implements both phrase-based SMT and neural MT approaches for translation without parallel corpora. The neural component supports multiple architectures—seq2seq, biLSTM with attention, and Transformer—and allows extensive parameter sharing across languages to improve data efficiency. Training relies on denoising auto-encoding and back-translation, with on-the-fly, multithreaded generation of synthetic parallel data to continually...
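
To make the denoising auto-encoding step more concrete, here is a minimal sketch of a sentence noise function of the kind commonly used for this purpose: random word dropout followed by a bounded local shuffle. It is an assumption about the general technique, not code from the repository; the name `add_noise` and its parameters and defaults are hypothetical.

```python
import random


def add_noise(tokens, drop_prob=0.1, shuffle_k=3, rng=None):
    """Corrupt a sentence by dropping words and lightly shuffling the rest."""
    if rng is None:
        rng = random.Random()
    # Word dropout: remove each token independently with probability drop_prob.
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept and tokens:
        # Keep at least one token so the corrupted sentence is never empty.
        kept = [rng.choice(tokens)]
    # Local shuffle: add noise in [0, shuffle_k + 1) to each position and
    # sort by the noisy keys, so a token moves at most shuffle_k places.
    keys = [i + rng.random() * (shuffle_k + 1) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=keys.__getitem__)
    return [kept[i] for i in order]


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    print(add_noise(sentence, rng=random.Random(0)))
```

In a denoising setup, a model would be trained to reconstruct the original sentence from the output of such a function.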

Downloads: 1 This Week

Last Update: 1 day ago

Arabic Corpus

Text categorization, Arabic language processing, language modeling

The Arabic Corpus, compiled by Dr. Mourad Abbas (http://sites.google.com/site/mouradabbas9/corpora). The Khaleej-2004 corpus contains 5690 documents divided into 4 topics (categories). The Watan-2004 corpus contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora should cite the two main references: (1) For the Watan-2004 corpus: M. Abbas, K. Smaili, D. Berkani (2011), Evaluation of Topic Identification Methods on...

Downloads: 6 This Week

Last Update: 2019-03-05

concordia

Powerful search library, best suited for computer-aided translation

Concordia - Roman goddess of agreement. Concordance searcher - tool for translators who need their translations to "agree" with one standard. Concordia is a C++ library for fast text lookup in large corpora. It uses a RAM stored index, which takes up approximately 600MB of memory for a corpus of 2 million sentences. It is based on the idea of a suffix array, enhanced by the presence of other auxiliary data structures. The effects are stunning - Concordia is able to do simple substring...
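
The suffix-array idea mentioned above can be sketched in a few lines. This is a toy illustration in Python, not Concordia's C++ implementation or API: the function names are hypothetical, the construction is deliberately naive, and none of Concordia's auxiliary data structures are modeled.

```python
def build_suffix_array(text):
    """Return the start offsets of all suffixes of text in sorted order."""
    # Naive construction that compares whole suffixes: fine for a demo,
    # far too slow and memory-hungry for a real corpus.
    return sorted(range(len(text)), key=lambda i: text[i:])


def _bound(text, sa, pattern, upper):
    """Binary search over the suffix array, comparing pattern-length prefixes."""
    lo, hi = 0, len(sa)
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[sa[mid]:sa[mid] + m]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_all(text, sa, pattern):
    """Return every offset at which pattern occurs in text, ascending."""
    start = _bound(text, sa, pattern, upper=False)
    end = _bound(text, sa, pattern, upper=True)
    return sorted(sa[start:end])


if __name__ == "__main__":
    corpus = "the cat sat on the mat"
    sa = build_suffix_array(corpus)
    print(find_all(corpus, sa, "at"))
```

Because all suffixes that start with the pattern sit next to each other in the sorted array, two binary searches are enough to find every occurrence, which is what makes lookups fast once the index is in memory.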

Downloads: 0 This Week

Last Update: 2019-02-28

ARARSS

Downloads: 0 This Week

Last Update: 2019-01-01

Ghawwas_V4

An open source system for Arabic corpora processing

Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions:
a. Frequency list for single words and N-Grams
b. Concordance
c. Collocation (MI, CHI Squared, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient)
d. Lexical patterns search
e. Two corpora frequency profile comparison based on MI, CHI, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient
f. Accept Windows and UTF-8 character...
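
Several of the collocation measures named above can be computed from four counts. The sketch below uses common textbook definitions of MI, T-Score, Dice and Log Dice; whether Ghawwas uses exactly these formulas is an assumption, and the function name `association_scores` and its parameters are hypothetical.

```python
import math


def association_scores(f_xy, f_x, f_y, n):
    """Score a word pair (x, y) from raw corpus counts.

    f_xy: co-occurrence count of x and y
    f_x, f_y: individual counts of x and y
    n: total number of tokens in the corpus
    """
    # Co-occurrence count expected if x and y were independent.
    expected = f_x * f_y / n
    dice = 2 * f_xy / (f_x + f_y)
    return {
        "mi": math.log2(f_xy / expected),
        "t_score": (f_xy - expected) / math.sqrt(f_xy),
        "dice": dice,
        "log_dice": 14 + math.log2(dice),
    }


if __name__ == "__main__":
    for name, value in association_scores(8, 16, 32, 1024).items():
        print(f"{name}: {value:.4f}")
```

MI rewards pairs that co-occur far more often than chance would predict, while Dice and Log Dice depend only on the pair and word counts and not on corpus size.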

1 Review

Downloads: 3 This Week

Last Update: 2018-12-09

GoogleTranslate

Google Translate client for Mac. All known issues have been fixed and the user experience has been optimized, but there may still be a few bugs. In the new version, whichever translation engine you use, the client first calls the language-detection interface of domestic Google Translate. If the traffic of your proxy node is abnormal, the request may be intercepted by Google, and you will need to enter the verification code (you can also use + + to open the debugging...

Downloads: 7 This Week

Last Update: 2023-02-17

Presage

the intelligent predictive text entry platform

Presage (formerly Soothsayer) is an intelligent predictive text entry system. Presage generates predictions by modelling natural language as a combination of redundant information sources. Presage computes probabilities for words which are most likely to be entered next by merging predictions generated by the different predictive algorithms. Presage's modular and extensible architecture allows its language model to be extended and customized to utilize statistical, syntactic, and semantic...
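
One simple way to merge the outputs of several predictors is a weighted linear combination of their word probabilities. The sketch below illustrates that general idea only; it is not Presage's API or necessarily its combination strategy, and the name `merge_predictions` and the weights are hypothetical.

```python
def merge_predictions(predictions, weights):
    """Combine per-predictor word probabilities into one ranked list.

    predictions: list of dicts mapping candidate word -> probability
    weights: one non-negative weight per predictor
    """
    total = sum(weights)
    merged = {}
    for dist, w in zip(predictions, weights):
        for word, p in dist.items():
            # Each predictor contributes in proportion to its normalized weight.
            merged[word] = merged.get(word, 0.0) + (w / total) * p
    # Highest probability first; ties broken alphabetically.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    ngram_predictor = {"the": 0.6, "then": 0.4}
    recency_predictor = {"the": 0.2, "they": 0.8}
    for word, prob in merge_predictions([ngram_predictor, recency_predictor], [1, 1]):
        print(f"{word}: {prob:.2f}")
```

With this scheme, a word suggested by several predictors accumulates probability from each, and adding a new predictor only requires adding another distribution and weight.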

3 Reviews

Downloads: 202 This Week

Last Update: 2018-10-11

ParsPort

ParsPort is a parsing tool for the Portuguese language.

...It implements a set of Perl scripts and CorpusSearch revision queries that convert a POS-tagged file (CLAWS format) into a parsed file (Penn Treebank format). ParsPort requires the installation of CorpusSearch2 and is optimized for UNIX (including macOS) and Linux operating systems. This parsing tool was developed at Centro de Linguística da Universidade de Lisboa, within the P.S. Post Scriptum project, and is based on the one designed by Beatrice Santorini for the French language. ParsPort users may modify the definition file and the revision queries as required to apply them to other languages.

Downloads: 0 This Week

Last Update: 2018-11-15

dadosSemiotica

Collector and manager of semiotic analysis data

This program is a web application for collecting and organizing text analysis data. It works with sets of texts, and the analyses are carried out on sentence-length portions. One of the preprocessing modules is based on CoGroo (a LibreOffice and OpenOffice.org Portuguese grammar checker).

Downloads: 0 This Week

Last Update: 2018-11-01

Fresh Memory

Flashcard application using the spaced repetition method

Fresh Memory is an application that helps you learn large amounts of any material using the spaced repetition method. Its main use is learning foreign words, but Fresh Memory can also be used to learn anything else. The learning data is stored as flashcards and dictionaries. The flashcards may have several fields, and the user controls which combination of fields to learn. The flashcards can contain formatted text and images.

2 Reviews

Downloads: 1 This Week

Last Update: 2018-06-27

KhmerText

Open data for a Khmer language corpus and lexicographic data that can be used for the development of free language tools for Khmer language, such as automatic translators, dictionaries, linguistic analysis tools, etc.

4 Reviews

Downloads: 46 This Week

Last Update: 2018-05-17

KH Coder

Quantitative Content Analysis or Text Mining

This project has moved. See http://khcoder.net/en for the latest and greatest. You can download this tool from the new home. See you there!

9 Reviews

Downloads: 0 This Week

Last Update: 2018-12-30

Search Results for "gnu/linux" - Page 3

Showing 348 open source projects for "gnu/linux"
