Page 6 | Best Open Source Linguistics Software 2024

Linguistics Software

1

BioEvent

This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction involves finding events and the parameters of each event in a text. The method is based on SVM, but other ML algorithms can be adopted. The method is detailed in the following paper: Ehsan Emadzadeh, Azadeh Nikfarjam, and Graciela Gonzalez. 2011. Double Layered Learning for Biological Event Extraction from Text. In Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task, Portland, Oregon, June. Association for Computational Linguistics.

Downloads: 0 This Week

Last Update: 2013-04-25
See Project
2

BioLemmatizer

Lemmatization tool for morphological analysis of biomedical literature

The BioLemmatizer is a domain-specific lemmatization tool for the morphological analysis of biomedical literature. It is tailored to the biological domain through integration of several published lexical resources related to molecular biology. It focuses on the inflectional morphology of English, including the plural forms of nouns, the conjugations of verbs, and the comparative and superlative forms of adjectives and adverbs. README: https://sourceforge.net/projects/biolemmatizer/files/ The BioLemmatizer 1.2 release adds optional functionality to normalize British English spellings into American English spellings and then retrieve the corresponding lemmas. If you use the BioLemmatizer to support academic research, please cite the following paper: Haibin Liu, Tom Christiansen, William A Baumgartner Jr, and Karin Verspoor. BioLemmatizer: a lemmatization tool for morphological processing of biomedical text. Journal of Biomedical Semantics 2012, 3:3.

Downloads: 0 This Week

Last Update: 2013-10-23
See Project
3

Board Game Language

Board Game Language (BGL, pronounced "bagel") is a programming language with natural-language syntax, aimed at first-time programmers. It uses board games as a metaphor for programming concepts, with the goal of teaching users the foundations of programming.

Downloads: 0 This Week

Last Update: 2014-06-23
See Project
4

BuckTagger

User-assisted tool for Arabic stem entry to Buckwalter Morpho Analyzer

Using rules written in a Drools decision table, BuckTagger determines the correct Buckwalter Tag based on morphological properties of the input, automatically extracted or given by the user. At the moment, BuckTagger is not complete; it can only handle input that is:
- Uninflected
- In lexical form, i.e., no clitics or affixes.
- A Perfect or Imperfect Verb
- Preferably the first and before-last letters are diacritized/vocalized.

The interface is in Arabic. See the README for more details. There is much room for development. Feel free to comment.

Downloads: 0 This Week

Last Update: 2014-05-22
See Project
5

C4 - Christian's C++ Code Collection

C4 is a C++ class library for analyzing sound files, particularly spoken and sung phonations. C4 provides features such as frequency analysis, pitch extraction, or calculation of voice quality parameters (e.g. alpha ratio, HNR, jitter, etc.).

Downloads: 0 This Week

Last Update: 2015-03-19
See Project
6

CHALICE

Connecting Historical Authorities with Links, Contexts and Entities. CHALICE is a historic placename gazetteer for the UK, published as Linked Data and linked to other widely-used sources of placename reference information on the semantic web.

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
7

CLEiM

Cross Lingual Education in Medicine

CLEiM (Cross Lingual Education in Medicine) is an open-source version of an intelligent system that extracts concepts from medical texts and provides qualified information, integrating information from various sources. The system was developed by the Intelligent System Group GSI (http://www.esi.uem.es/gsi/) at UEM University. Named entity recognition (NER) is based on the GATE platform. Installation is simple, and the system can be used as a web application. It has been tested under apache-tomcat. The original system has been used successfully to carry out active learning activities with medical students, but it could also be of interest in many other fields of knowledge.

Downloads: 0 This Week

Last Update: 2014-09-10
See Project
8

CRFSharp

CRFSharp is a .NET (C#) implementation of Conditional Random Fields

CRFSharp (aka CRF#) is a .NET (C#) implementation of Conditional Random Fields, a machine learning algorithm for learning from labeled sequences of examples. It is widely used in natural language processing (NLP) tasks such as word breaking, POS tagging, named entity recognition, and query chunking. CRF#'s main algorithm is the same as that of CRF++, written by Taku Kudo. It encodes model parameters with L-BFGS. It also offers significant improvements over CRF++, such as fully parallel encoding and optimized memory usage. When training on a corpus, CRF# can make full use of multi-core CPUs and uses very little memory compared with CRF++, and its memory usage grows slowly and smoothly as the size of the training corpus and the number of tags increase. With multi-threaded processing, CRF# is better suited than CRF++ to training on large data sets and tag sets. For example, on a machine with 64GB, CRF# quickly encodes a model with more than 450 million features.

Downloads: 0 This Week

Last Update: 2015-08-03
See Project
9

CRIS-IE-Smoking

GATE based app to extract patient smoking status from free text

This application was developed by the NIHR Biomedical Research Centre at the Institute of Psychiatry and South London and Maudsley NHS Foundation Trust, in collaboration with the University of Sheffield. Its purpose is to identify the smoking status of an individual, based on text evidence in clinical notes. Currently, it classifies patients as 'current', 'past' or 'never'. It runs on the GATE infrastructure, available at http://gate.ac.uk/. Please contact richard.g.jackson@slam.nhs.uk for support/queries.

Downloads: 0 This Week

Last Update: 2015-02-12
See Project
10

CTexT Alignment Interface

Align parallel corpora on sentence level

1 Review

Downloads: 0 This Week

Last Update: 2014-07-14
See Project
11

CTexT Alignment Interface Pro

Aligns parallel data at sentence level and automatically creates .tmx files for use with Autshumato ITE

Downloads: 0 This Week

Last Update: 2015-02-24
See Project
12

Chaski

Distributed phrase-based machine translation training tool based on Hadoop.

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
13

Classical Arabic Corpus

A corpus containing more than 1 million distinct Arabic words.

This project was developed as part of a master's thesis titled "Edit Distance Adapted to Natural Language Words". The project consists of three parts: first, the corpus, which gathers more than one million distinct Arabic words; second, the text files of the Arabic resources; third, the index file, which presents some information about these resources. Additional details about these parts are available in the README file.

Downloads: 0 This Week

Last Update: 2016-01-19
See Project
14

CoSyne Integrated Prototype

Multilingual Content Synchronization with Wikis: CoSyne is a Research and Technological Development project co-funded by the European Union. Details: http://cosyne.eu

Downloads: 0 This Week

Last Update: 2013-04-29
See Project
15

Colloquium QDA

A free and open source qualitative ethnographic interview coding tool.

Colloquium QDA is a tool for custom coding and analyzing qualitative ethnographic interviews. To run, make sure you first have JRE 8 or later installed (http://www.oracle.com/technetwork/java/javase/downloads/). Colloquium QDA is an open source cross-platform Java Swing app utilizing an embedded Java DB with Lucene integrated search.

Downloads: 0 This Week

Last Update: 2017-01-23
See Project
16

Communication Supporting System

Downloads: 0 This Week

Last Update: 2015-03-26
See Project
17

Communication Supporting System

Downloads: 0 This Week

Last Update: 2013-05-29
See Project
18

CompE Toolkit

Data Type Converter

CompE Toolkit allows the user to seamlessly convert between binary, decimal, hexadecimal, and 32-bit floating point representation. It uses a simple, user-friendly interface designed for maximum efficiency and minimal clutter.
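The conversions described above can be sketched in a few lines of Python. This is an illustrative sketch only, not CompE Toolkit's own code: the function names are hypothetical, and it assumes that "32-bit floating point" means IEEE 754 single precision.

```python
import struct


def float_to_bits(value: float) -> str:
    """Return the 32-bit pattern of value as a string of 0s and 1s.

    Assumption: the 32-bit format is IEEE 754 single precision.
    """
    # Pack as a big-endian float, then reinterpret the same 4 bytes as an unsigned int.
    (as_int,) = struct.unpack(">I", struct.pack(">f", value))
    return format(as_int, "032b")


def bits_to_float(bits: str) -> float:
    """Inverse of float_to_bits: interpret a 32-character bit string as a float."""
    (value,) = struct.unpack(">f", struct.pack(">I", int(bits, 2)))
    return value


def int_representations(n: int) -> dict:
    """Return the binary, decimal, and hexadecimal spellings of an integer."""
    return {"binary": bin(n), "decimal": str(n), "hexadecimal": hex(n)}


if __name__ == "__main__":
    print(int_representations(255))
    print(float_to_bits(1.0))
    print(bits_to_float(float_to_bits(-2.5)))
```

Going through `struct` keeps the float conversion exact at the bit level; values that single precision cannot represent exactly are rounded when packed.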

Downloads: 0 This Week

Last Update: 2015-02-16
See Project
19

ConTextKit

ConTextKit is a Java-based implementation of Wendy Chapman's ConText algorithm for annotating the context of medical documents, specifically the negation, temporality, and experiencer.

Downloads: 0 This Week

Last Update: 2014-06-24
See Project
20

CoocViewer

Viewer for co-occurrences and positional co-occurrences

A Demo is available at: http://coocviewer.sourceforge.net/coocviewer/index.php

Downloads: 0 This Week

Last Update: 2013-11-08
See Project
21

CorpSe

CORPSE (CORPus SEarch) is a powerful search engine written in Java. The aim is to provide an efficient implementation of a word level inverted index search with various cool functions that can be used on very large corpora.
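For readers unfamiliar with the term, a word-level inverted index maps each word to the documents that contain it. The Python sketch below illustrates the general idea only; it is not CorpSe's implementation (which is written in Java), and the function names and the whitespace tokenization are assumptions made for the example.

```python
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map each lowercased word to the set of document ids that contain it.

    Assumption: words are separated by whitespace; a real corpus tool
    would use a proper tokenizer.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Intersect the posting sets of all query words.
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


if __name__ == "__main__":
    docs = {1: "the cat sat", 2: "the dog sat", 3: "a cat ran"}
    idx = build_index(docs)
    print(search(idx, "cat"))
    print(search(idx, "the sat"))
```

Looking up a word costs one dictionary access regardless of corpus size, which is why this structure suits very large corpora.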

1 Review

Downloads: 0 This Week

Last Update: 2013-04-26
See Project
22

Corpus Toolkit

A text management tool for linguistic purposes...

Downloads: 0 This Week

Last Update: 2017-11-23
See Project
23

Corpus redundancy manager

Redundancy due to cut-paste operations in text creates bias in machine learning for NLP. This module takes a directory and produces a subset of the files in that directory (in a list) with an upper bound on similarity between two files.
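One way such a filter could work is a greedy pass that keeps a file only if it is not too similar to any file already kept. The sketch below is a guess at the general technique, not this project's code: the function names are hypothetical, and Jaccard similarity over word sets stands in for whatever similarity measure the module actually uses.

```python
from pathlib import Path


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two word sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_non_redundant(texts: dict, max_similarity: float) -> list:
    """Greedily keep names whose text is at most max_similarity to every kept text.

    Assumption: similarity is Jaccard over lowercased, whitespace-split words.
    """
    kept = []  # list of (name, word_set) pairs
    for name in sorted(texts):
        words = set(texts[name].lower().split())
        if all(jaccard(words, other) <= max_similarity for _, other in kept):
            kept.append((name, words))
    return [name for name, _ in kept]


def select_from_directory(directory: str, max_similarity: float) -> list:
    """Apply select_non_redundant to every regular file in a directory."""
    texts = {
        path.name: path.read_text(encoding="utf-8")
        for path in Path(directory).iterdir()
        if path.is_file()
    }
    return select_non_redundant(texts, max_similarity)


if __name__ == "__main__":
    sample = {
        "a.txt": "the patient has a fever",
        "b.txt": "the patient has a fever",
        "c.txt": "no symptoms reported today",
    }
    print(select_non_redundant(sample, 0.5))
```

The greedy pass guarantees the similarity bound for every pair in the result, but the subset it returns depends on file order and is not necessarily the largest possible one.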

Downloads: 0 This Week

Last Update: 2014-06-30
See Project
24

Cross-Language Computational Linguistics

Cross-language resources

The AFEWC corpus is a multilingual collection of comparable text articles in Arabic, French, and English. Each triple of articles covers the same topic (aligned at article level). The AFEWC corpus is collected from Wikipedia and is available for free for research purposes only. It is composed of 40K aligned articles, 91.3M English words, 57.8M French words, 22M Arabic words, 2.8M English unique words, 1.9M French unique words, and 1.5M Arabic unique words. Wikipedia text is available under the Creative Commons Attribution-ShareAlike 3.0 License. https://en.wikipedia.org/wiki/Wikipedia:About To cite the corpora: M. Saad, D. Langlois, and K. Smaïli. Extracting Comparable Articles from Wikipedia and Measuring their Comparabilities. Procedia - Social and Behavioral Sciences, 95(0):40-47, 2013. ISSN 1877-0428.

Downloads: 0 This Week

Last Update: 2015-09-11
See Project
25

Cunei Machine Translation Platform

Cunei is a data-driven machine translation system that builds dynamic, statistical models based on instances of known translations found in a corpus.

1 Review

Downloads: 0 This Week

Last Update: 2013-06-05
See Project