Showing 89 open source projects for "linguistic"

  • 1
    Linguistic Tree Constructor

    Syntax tree editor for rapid annotation of existing text

    Linguistic Tree Constructor (LTC) is a tool for drawing linguistic syntax trees of already-existing text. It is a syntax editor, not a text editor, so the text has to exist already. It is best suited for large-scale, rapid creation of hand-annotated treebanks. Users can define their own node categories and label each node with user-definable labels. LTC supports "generic", X-Bar, and RRG trees, as well as interlinear texts in SIL SFM format.
    Downloads: 21 This Week
  • 2
    compromise

    Modest natural-language processing

    Language is complicated and there's a gazillion words. Compromise is a JavaScript library that interprets and pre-parses text and makes some reasonable decisions so things are way easier. Compromise tries its best to parse text. It is small, quick, and often good enough. It is not as smart as you'd think. Conjugate and negate verbs in any tense. Play between plural, singular and possessive forms. Interpret plain-text numbers. Handle implicit terms. Use it on the client-side or as an...
    Downloads: 1 This Week
  • 3
    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is your one-stop shop for natural language processing in Java! CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports 6 languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text...
    Downloads: 0 This Week
  • 4
    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts...
    Downloads: 0 This Week
  • 5

    Linguistic Analyzer

    The Linguistic Analyzer is a tool for corpus analysis and comparison

    The Linguistic Analyzer (Almuhalil Alloghawy) is a free tool designed by a team from Al-Imam Muhammad bin Saud Islamic University. It can be used for corpus analysis and comparison in terms of several linguistic characteristics, such as frequency list generation, concordances, collocation extraction, the difference between two words, and keyword identification.
    Downloads: 0 This Week
  • 6
    Lingua

    The most accurate natural language detection library for Java

    Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages.
    Downloads: 0 This Week
  • 7
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 1,811 This Week
  • 8
    IMS Open Corpus Workbench

    Indexing and query tools for very large text corpora

    The IMS Open Corpus Workbench is a collection of tools for managing and querying large text corpora (100 M words and more) with linguistic annotations. Its central component is the flexible and efficient query processor CQP, which can be used interactively in a terminal session, as a backend e.g. from a Perl script, or through the Web-based GUI CQPweb.
    Downloads: 50 This Week
  • 9
    Fuzzy sets for Ada

    Fuzzy sets, logic, numbers; intuitionistic fuzzy sets, fuzzy linguis

    Fuzzy sets for Ada is a library providing implementations of confidence factors with the operations not, and, or, xor, +, and *, classical fuzzy sets with the set-theoretic operations and the operations of the possibility theory, intuitionistic fuzzy sets with the operations on them, fuzzy logic based on the intuitionistic fuzzy sets and the possibility theory; fuzzy numbers, both integer and floating-point with conventional arithmetical operations, and linguistic variables and sets...
    Downloads: 1 This Week
  • 10
    LaBB-CAT

    A linguistic annotation store

    LaBB-CAT is a browser-based linguistics research tool that stores recordings and regular-expression-searchable text transcripts of interviews. The search results, entire transcripts, and media can be viewed or exported in a variety of formats.
    Downloads: 4 This Week
  • 11
    Text Encoding Initiative

    TEI produces the TEI Guidelines and associated software

    The TEI is an international and interdisciplinary standard used by libraries, museums, publishers, and academics to represent all kinds of literary and linguistic texts, using an encoding scheme that is maximally expressive and minimally obsolescent.
    Downloads: 4 This Week
  • 12
    gadict

    gadict is a small collection of EN to EN/RU/UK dictionaries.

    gadict is a small collection of EN to EN/RU/UK dictionaries. The project also provides additional linguistic information about the English language. All materials are freely accessible (public domain).
    Downloads: 0 This Week
  • 13
    Fuzzy machine learning framework

    A library and a GUI front-end for fuzzy machine learning

    Fuzzy machine learning framework is a library and a GUI front-end for machine learning using intuitionistic fuzzy data. The approach is based on the intuitionistic fuzzy sets and the possibility theory. Further characteristics are fuzzy features and classes; numeric, enumeration features and features based on linguistic variables; user-defined features; derived and evaluated features; classifiers as features for building hierarchical systems; automatic refinement in case of dependent features...
    Downloads: 0 This Week
  • 14
    SentimentAnalysis-Rick&Morty

    Rick & Morty Sentiment Analysis - End-of-Degree Project - UNIR

    The remarkable progress in the field of Big Data has driven the development of new technologies in natural language processing and data analysis. Text mining is an application of data analysis that extracts relevant information from related writings in different linguistic contexts. Within natural language processing, sentiment analysis and classification therefore stand out as a key application supported by text mining. Through the extraction of information from textual data...
    Downloads: 0 This Week
  • 15

    Tokenized Text Aligner

    Aligns tokens in two versions of a text with differing tokenization.

    This tool performs token-by-token alignment of two versions of a text with differing tokenization by interpreting the results of a file diff (https://docs.python.org/3/library/difflib.html). It is intended for use in the preparation of annotated linguistic corpora, where differences in tokenization may arise (i) following corrections or modifications to the source text or (ii) through the creation of different layers of annotation (part-of-speech, treebank) requiring different tokenization...
    Downloads: 0 This Week
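
The diff-based alignment idea described above can be sketched with Python's standard `difflib` module. This is only an illustration under stated assumptions, not the project's actual code: the function name `align_tokens` and the span-pair output format are hypothetical, and the real tool may interpret the diff differently.

```python
from difflib import SequenceMatcher


def align_tokens(tokens_a, tokens_b):
    """Align two tokenizations of the same text.

    Returns a list of ((a_start, a_end), (b_start, b_end)) pairs of
    half-open index spans. Hypothetical helper for illustration only.
    """
    # autojunk=False keeps frequent tokens (e.g. punctuation) from being
    # treated as junk on long inputs.
    matcher = SequenceMatcher(a=tokens_a, b=tokens_b, autojunk=False)
    alignment = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Identical runs align token by token.
            for offset in range(i2 - i1):
                a_span = (i1 + offset, i1 + offset + 1)
                b_span = (j1 + offset, j1 + offset + 1)
                alignment.append((a_span, b_span))
        else:
            # replace / delete / insert: align the differing spans as a
            # whole block (one side may be empty).
            alignment.append(((i1, i2), (j1, j2)))
    return alignment


if __name__ == "__main__":
    version_a = ["do", "n't", "stop", "."]
    version_b = ["don't", "stop", "."]
    for a_span, b_span in align_tokens(version_a, version_b):
        print(version_a[slice(*a_span)], "<->", version_b[slice(*b_span)])
```

Differing regions are kept as block-to-block spans here; a fuller aligner would likely refine them further, for example by comparing character offsets.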
  • 16

    AhoTTS - TTS for Basque and Spanish

    Text-to-Speech for Basque and Spanish

    Text-to-speech converter for Basque and Spanish. It includes linguistic processing and built voices for both languages. Its acoustic engine is based on hts_engine, and it uses a high-quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/
    Downloads: 0 This Week
  • 17
    XZVoice

    Free and open source text-to-speech software

    ..., and comprehensively use acoustic parameters and linguistic parameters to establish multiple automatic prediction models based on deep learning. Using massive audio data to train the pronunciation model, the synthetic sound is real, full, cadenced, and expressive, and the MOS score has reached the professional level in the industry.
    Downloads: 0 This Week
  • 18
    Apertium: Machine Translation Toolbox

    The free and open-source rule-based machine translation platform

    Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs.
    Downloads: 7 This Week
  • 19

    AhoTTS Multilingual, a Multilingual TTS

    Text-to-Speech TTS for Basque, Spanish, Catalan, Galician and English

    Text-to-speech converter for Basque, Spanish, Catalan, Galician and English. It includes linguistic processing and built voices for all of these languages. Its acoustic engine is based on hts_engine, and it uses a high-quality vocoder called AhoCoder. Developed by Aholab Signal Processing Laboratory: https://aholab.ehu.es/aholab/ http://aholab.ehu.es/ahocoder/
    Downloads: 0 This Week
  • 20

    OLiA

    OWL/DL ontologies for linguistic annotations

    MOVED TO https://github.com/acoli-repo/olia. The Ontologies of Linguistic Annotations (OLiA) provide an OWL/DL taxonomy of data categories as a reference for linguistic annotation (OLiA Reference Model), plus OWL/DL models for a large number of annotation schemes (OLiA Annotation Models) and their relationship to reference data categories (OLiA Linking Models). The OLiA Reference Model itself is linked to community-maintained repositories such as GOLD (http://linguistics-ontology.org...
    Downloads: 0 This Week
  • 21

    TreeForm Syntax Tree Drawing Software

    Syntax Tree Drawing Software (Linguistics)

    TreeForm Syntax Tree Drawing Software is a linguistic syntax/semantics tree drawing editor designed for graphical n-ary tree drawing. Mac users can install the software through the new package, but must authorize it under "System Preferences" > "Security & Privacy". Windows and Linux users can run the software through the JAR file directly. All users must have Java 8 or higher installed. https://java.com/en/download/
    Downloads: 197 This Week
  • 22
    MITIE

    MITIE: library and tools for information extraction

    ... Machines[3]. MITIE offers several pre-trained models providing varying levels of support for English, Spanish, and German, trained using a variety of linguistic resources (e.g., CoNLL 2003, ACE, Wikipedia, Freebase, and Gigaword). The core MITIE software is written in C++, but bindings for several other languages, including Python, R, Java, C, and MATLAB, allow users to quickly integrate MITIE into their own applications.
    Downloads: 0 This Week
  • 23

    TBXTools

    A Python class for Terminology Extraction and Management

    TBXTools allows easy and rapid Terminology Extraction and Management. This tool implements both statistical and linguistic methods, along with several utilities to create and manage terminological databases. It is written in Python and uses NLTK (Natural Language Toolkit). The project has moved to GitHub: https://github.com/aoliverg/TBXTools
    Downloads: 4 This Week
  • 24
    SLING

    A natural language frame semantics parser

    The aim of the SLING project is to learn to read and understand Wikipedia articles in many languages for the purpose of knowledge base completion, e.g. adding facts mentioned in Wikipedia (and other sources) to the Wikidata knowledge base. We use frame semantics as a common representation for both knowledge representation and document annotation. The SLING parser can be trained to produce frame semantic representations of text directly without any explicit intervening linguistic representation...
    Downloads: 0 This Week
  • 25
    [ARCHIVAL] The central forum for the MWE community. Share your open-source data sets and MWE extraction tools, exchange ideas on evaluation strategies and further development of the tools, and discuss theoretical definitions and linguistic properties of MWEs.
    Downloads: 3 This Week