Showing 350 open source projects for "java open source"

  • 1
    Entity recognition and normalization software for biomedical text
    Downloads: 0 This Week
  • 2

    Hebrew Deflector

    A program to de-inflect modern Hebrew words

    Hebrew Deflector tries to guess the root, the pattern, and the form of a modern Hebrew word provided by the user. It applies the existing rules of the language and displays the list of possible answers. It is not a dictionary, so it does not know whether the word (or any of the listed forms) exists, and it knows nothing about exceptions to the rules.
    Downloads: 0 This Week
  • 3

    Welsh Natural Language Toolkit

    WNLT is a suite of open source natural language modules for the Welsh language

    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules: (a) Word Segmentation, for separating text into words; (b) Sentence Boundary Disambiguation, for finding sentence boundaries; (c) Part of Speech Tagger, for determining the part of speech of each word; (d) Morphological Analyser, for identifying the root form (lemma) of words....
    Downloads: 1 This Week
  • 4
    Cross-platform application that helps users learn vocabulary from any foreign language. Add, edit, and delete vocabulary words (with translation, category, sentence, notes, and picture), then review them with quizzes.
    Downloads: 0 This Week
  • 5

    diasim

    Dialogue Similarity

    Tools for calculating similarity (including lexical and syntactic) between speakers in dialogue, across standard and randomised corpora.
    Downloads: 0 This Week
  • 6

    texrex

    Web corpus creation software (moved to GitHub)

    This project has moved to GitHub: https://github.com/rsling/texrex https://github.com/rsling/cow
    Downloads: 0 This Week
  • 7

    bnf2xml

    Simple BNF parser that marks up matches with XML

    bnf2xml is a simple BNF parser that takes text as input, searches it according to a BNF query file, and outputs the text marked up with XML labels that show context. bnf2xml is as simple to use as any text-processing binary, e.g. awk(1) or grep(1). It does not require a C API because it outputs simple XML labeling. The README is visible on the file download page.
    EXAMPLE:
        $ echo "hi" | bnf2xml patternfile
        <word><alph>h</alph><alph>i</alph></word>
        or <gas>hydrogen iodide</gas>
    patternfile says how to find...
    Downloads: 0 This Week
  • 8
    Part-of-speech tagging is the task of assigning symbols from a particular set to words in a natural language text. ACOPOST implements and extends well-known machine learning techniques and provides a uniform environment for testing.
    Downloads: 0 This Week
  • 9

    Classical Arabic Corpus

    A corpus containing more than 1 million distinct Arabic words.

    This project was developed as part of a master's thesis titled "Edit Distance Adapted to Natural Language Words". It consists of three parts. First, the corpus, which gathers more than one million distinct Arabic words. Second, the text files of the Arabic resources. Third, the index file, which presents some information about these resources. Additional details about these parts are available in the README file.
    Downloads: 0 This Week
  • 10
    Software for speech research. It includes programs and libraries for signal processing, along with general purpose scientific libraries. Most of the code is in Python, with C/C++ supporting code. Also, contains code releases corresponding to publishe
    Downloads: 0 This Week
  • 11
    FREJ
    FREJ stands for "Fuzzy Regular Expressions for Java". It is a command-line tool and library that lets you easily compare strings with patterns, disregarding nasty typos and considering several variants (like "Barack Obama", "B.H.Obama", etc.). The project sources have moved to GitHub: https://github.com/RodionGork/FREJ
    Downloads: 0 This Week
  • 12
    ATTENTION! Morfologik is now at GitHub: https://github.com/morfologik/
    Downloads: 0 This Week
  • 13
    KneeTex is an open-source, stand-alone application for information extraction from narrative reports that describe an MRI scan of the knee. Given an MRI report as input, the system outputs the corresponding clinical findings in the form of JavaScript Object Notation (JSON) objects. The extracted information is mapped onto TRAK, an ontology that formally models knowledge relevant for the rehabilitation of knee conditions.
    Downloads: 0 This Week
  • 14
    C++ template library for modular construction of factored probabilistic time-series models, model trainers, and recognizers.
    Downloads: 0 This Week
  • 15

    Virastyar

    Virastyar is a spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages. Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013).
    Downloads: 60 This Week
  • 16

    mwetoolkit

    THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

    The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc. Even though it focuses on multiword expressions, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics. The mwetoolkit can be...
    Downloads: 0 This Week
  • 17
    BANNER is a named entity recognition system intended primarily for biomedical text. It uses conditional random fields as the primary recognition engine and includes a wide survey of the best techniques described in recent literature.
    Downloads: 1 This Week
  • 18

    eAlign

    A parallel corpora (bitext) aligning tool. Create TMX databases

    Full support is available at superalign.sourceforge.net. Features: aligning parallel corpora; creating TMX, CSV, and tab-delimited TMs; automatic aligning of text; super fast handling of multiple files; very easy GUI handling of files under Windows; CAT tool assistant.
    Downloads: 0 This Week
  • 19
    SuperAlign was fully updated as of 15 July 2013 and is now released under the name eAlign as well. A parallel corpora (bitext) aligning tool. Create TMX databases and align translations for Translation Memory databases. Use multiple files in multiple formats to align them with their translations. The full workflow is built in with a GUI interface. SuperAlign-eAlign uses the hunalign algorithm.
    Downloads: 0 This Week
  • 20

    AJ-JpnRa Tool

    「AJ-JpnRa Tool」 is a Japanese text readability analysis program.

    We have temporarily suspended the release of the program due to a patent application (2020.09). AJ-JpnRa Tool is a Japanese text readability analysis program, mainly based on the guidelines of the JLPT. With it you can analyze the readability of Japanese text from the length of the text and the level of its Chinese characters. The Chinese character level is analyzed using the AJ-JpnRa Tool database, which was built according to essential Chinese character education guideline of Japan elementary...
    Downloads: 0 This Week
  • 21
    AsiEs stands for Asistente de Escritura (writing assistant). It provides word prediction and autocomplete for fast writing. Designed for people who have difficulty typing on a keyboard, it improves writing speed by saving the user up to 50% of the keystrokes needed, and it helps avoid orthographic errors. Made by Fundación Teletón Uruguay (http://www.teleton.org.uy/home/)
    Downloads: 0 This Week
  • 22
    Downloads: 0 This Week
  • 23
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 0 This Week
  • 24

    Alfanous

    Quran Search Engine

    Alfanous (The Lantern - الفانوس) is an Arabic search engine API that provides simple and advanced search in the Holy Quran, with more features and many interfaces...
    Downloads: 2 This Week
  • 25
    This tool scores machine translation performance with the TER (Translation Edit Rate) metric. The code is based on Snover's algorithm.
    Downloads: 0 This Week