Java Linguistics Software

Browse free open source Java Linguistics Software and projects below. Use the toggles on the left to filter open source Java Linguistics Software by OS, license, language, programming language, and project status.

  • 1
    WordNet Database in various SQL formats
    Downloads: 38 This Week
  • 2
    Entity recognition and normalization software for biomedical text
    Downloads: 47 This Week
  • 3
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Downloads: 13 This Week
  • 4

    sgmweka

    Weka wrapper for the SGM toolkit for text classification and modeling.

    Provides Sparse Generative Models for scalable and accurate text classification and modeling in high-speed, large-scale text mining. Classification has lower time complexity than comparable software because inference is based on a sparse model representation and uses an inverted index. The provided .zip file is in the Weka package format, giving access to text classification. Other functions are usable through Java command-line commands or by including the classes in Java projects.
    Downloads: 18 This Week
  • 5
    TXM

    Unicode XML TEI text analysis platform

    TXM is a free and open-source, cross-platform, Unicode- and XML-based text analysis environment and graphical client that supports Windows, Linux, and Mac OS X. It can also be used online as a J2EE-compliant web portal (GWT-based) with built-in access control. Download the latest version of TXM at http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en. TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP full-text search engine (http://cwb.sourceforge.net) and a range of statistical functions (factorial analysis, classification, co-occurrence analysis, etc.) based on R packages (http://www.r-project.org). Read the scientific background at the Textométrie project website, http://textometrie.ens-lyon.fr/?lang=en, and a full description at the TEI Tools wiki, http://wiki.tei-c.org/index.php/TXM.
    Downloads: 11 This Week
  • 6
    LaBB-CAT

    A linguistic annotation store

    LaBB-CAT is a browser-based linguistics research tool that stores recordings and regular-expression-searchable text transcripts of interviews. The search results, entire transcripts, and media can be viewed or exported in a variety of formats.
    Downloads: 9 This Week
  • 7
    Thinknowlogy

    The world's only naturally intelligent knowledge technology

    Natural intelligence is the utilization of naturally occurring logic. This naturally occurring logic provides concrete clues for organizing natural objects, like:
    - Grouping objects that belong together,
    - Separating objects that don't belong together,
    - Archiving objects that have become less important.
    Natural language and spatial information are sources of natural intelligence:
    - Natural language provides concrete logic for organizing knowledge objects,
    - Spatial information provides concrete logic for organizing spatial objects (utilized in, e.g., self-driving cars).
    In this way, our brains know how to organize their knowledge and spatial information. I focus on natural language because this source of natural intelligence is hardly understood by scientists. Hence the inability of Large Language Models to organize changes in their knowledge independently.
    Downloads: 7 This Week
  • 8
    HermeneutiX

    Your graphical tool for Syntactic/Semantic Structure Analysis of texts

    HermeneutiX is a tool for diagramming the syntactic and semantic structures of complex (not necessarily foreign-language) texts (e.g. Bible or other historical excerpts). HermeneutiX is now part of SciToS (the scientific tool set). Starting with version 2.0.0, HermeneutiX can be found on GitHub. Please check out the release summary: https://github.com/scientific-tool-set/scitos/releases. For an introduction, check out this video: https://youtu.be/uQjewyG0Ad8. PS: To run a Java application such as HermeneutiX (i.e. SciToS), you need a Java Runtime Environment (JRE). HermeneutiX is currently built to be compatible down to JRE version 6. You may download the current JRE here: http://www.java.com/en/download
    Downloads: 1 This Week
  • 9
    Korean Analyzer Rhino

    Parsing Korean words by morpheme and part-of-speech

    RHINO parses Korean words by morpheme and part-of-speech. Its dictionaries are based on the Korean Modern Tagged Corpus (on the scale of 12 million phrases), which was created by the Korean government, so it can analyze many cases of stems and endings. The newly developed Dynamic Dictionary Technology lets words react to their context; that is, it acts as a programmed database. For more information, see the files in the help folder.
    Downloads: 1 This Week
  • 10

    TextComparer

    Small Java program to compare two texts

    Small Java program to compare two texts, originally designed to find quotations in a Byzantine anthology. It can quite likely be used to detect plagiarism between two texts as well. A graphical interface allows easy navigation between corresponding parts of the two texts; it uses the Java TextArea from http://software.jessies.org/salma-hayek/ for this. The program and the algorithm it uses could probably be improved considerably.
    Downloads: 1 This Week
  • 11
    srt-translator

    Subtitle translator from one natural language to another.

    Translates subtitles in SubRip format from one natural language to another. It is based on Google Translate without the API and therefore without payment. The translator has automatic and manual spell checkers.
    Downloads: 1 This Week
  • 12
    Downloads: 0 This Week
  • 13
    Annoschemer is a small tool for easy editing of MMAX2 annotation schemes.
    Downloads: 0 This Week
  • 14

    AraRooter

    Find Arabic Root Word

    Using Machine Learning, AraRooter finds the three-lettered root of any Arabic lemma with around 84% accuracy.
    Downloads: 0 This Week
  • 15
    Arabic Morphology& Sentacs coding
    This project aimed to create a framework and binary data format for an etymological Arabic system. It will no longer be hosted on SourceForge because the terms of use classify me as an enemy, so I am prohibited from using SourceForge services.
    Downloads: 0 This Week
  • 16
    Autshumato MTWS

    Autshumato Machine Translation Web Service

    Web service providing access to the Autshumato Machine Translation (MT) and other Moses statistical MT systems. Functionality includes:
    - Automatic sentence, document, and web page translation.
    - Improvements for translations.
    - Reviewer requests and an interface to review improvements.
    - Connection to the latest version of the Autshumato ITE: post-edits made to inserted automatic translations are automatically submitted to the MTWS.
    - Administration interface to add users, reviewers, and MT systems.
    - Exposed API for all of the services.
    - Ability to log into the system using your Google or Facebook ID.
    - All requests are logged by IP.
    Licensed under the GNU GPL v3 (or later): http://www.gnu.org/licenses/gpl-3.0.txt
    Downloads: 0 This Week
  • 17
    A utility application for updating and integrating translation memories, created by the Autshumato ITE, over a network. Licensed under the TMate Open Source License and free for anyone to download and use.
    Downloads: 0 This Week
  • 18
    BANAL - Banal And Not A Language. A prototyping notation compatible with Java and C# (via the largest possible common footprint between the two).
    Downloads: 0 This Week
  • 19
    BANNER is a named entity recognition system intended primarily for biomedical text. It uses conditional random fields as the primary recognition engine and includes a wide survey of the best techniques described in recent literature.
    Downloads: 0 This Week
  • 20

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.
    Downloads: 0 This Week
  • 21

    BioC

    We describe a simple XML format to share text documents and annotations

    A minimalist approach to sharing text documents and data annotations. Allows a large number of different annotations to be represented.
    Project files contain:
    - simple code to hold/read/write data and perform sample processing
    - BioC-formatted corpora
    - BioC tools that work with BioC corpora
    BioC goals:
    - simplicity
    - interoperability
    - broad use
    - reuse
    There should be little investment required to learn to use a format or a software module to process that format. We are interested in reuse, and we focus on common NLP tasks that are broadly useful for text mining.
    Downloads: 0 This Week
  • 22

    BioContext

    Software for extraction of biomedical information from literature

    Downloads: 0 This Week
  • 23
    This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters of each event in a text. The method is based on SVM, but other ML algorithms can be adopted. The method details are explained in the following paper: Ehsan Emadzadeh, Azadeh Nikfarjam, and Graciela Gonzalez. 2011. Double Layered Learning for Biological Event Extraction from Text. In Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task, Portland, Oregon, June. Association for Computational Linguistics.
    Downloads: 0 This Week
  • 24

    BioLemmatizer

    Lemmatization tool for morphological analysis of biomedical literature

    The BioLemmatizer is a domain-specific lemmatization tool for the morphological analysis of biomedical literature. It is tailored to the biological domain through integration of several published lexical resources related to molecular biology. It focuses on the inflectional morphology of English, including the plural form of nouns, the conjugations of verbs, and the comparative and superlative form of adjectives and adverbs. README: https://sourceforge.net/projects/biolemmatizer/files/ The BioLemmatizer 1.2 release adds an optional functionality to normalize British English spellings into American English spellings and then retrieve corresponding lemmas. If you use the BioLemmatizer to support academic research, please cite the following paper: Haibin Liu, Tom Christiansen, William A Baumgartner Jr, and Karin Verspoor. BioLemmatizer: a lemmatization tool for morphological processing of biomedical text. Journal of Biomedical Semantics 2012, 3:3.
    Downloads: 0 This Week
  • 25
    Board Game Language
    Board Game Language (BGL, pronounced "bagel") is a natural language syntax programming language for first-time programmers. It uses board games as a metaphor for programming concepts, with the goal of teaching users the foundations of programming.
    Downloads: 0 This Week