Showing 282 open source projects for "java open source"

  • 1
    Java application for training and deploying text processing applications such as part-of-speech taggers, based on a re-implementation of Brill's algorithm in Java.
    Downloads: 0 This Week
  • 2

    t2t-pipe

    automatic alignment pipeline for parallel treebanks

    The *Tree-to-Tree (t2t) Alignment Pipe* is a collection of Python scripts that coordinate the automatic alignment of parallel treebanks from plain text files with a single call from a Unix command line. Supported languages: DE, FR, EN
    Downloads: 0 This Week
  • 3

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 0 This Week
  • 4

    BioLemmatizer

    Lemmatization tool for morphological analysis of biomedical literature

    The BioLemmatizer is a domain-specific lemmatization tool for the morphological analysis of biomedical literature. It is tailored to the biological domain through integration of several published lexical resources related to molecular biology. It focuses on the inflectional morphology of English, including the plural form of nouns, the conjugations of verbs, and the comparative and superlative form of adjectives and adverbs. README:...
    Downloads: 0 This Week
  • 5
    Perstem
    Perstem is a Persian (Farsi) stemmer, morphological analyzer, transliterator, and partial part-of-speech tagger. Inflexional morphemes are separated or removed from their stems. Perstem can also tokenize and transliterate between various character set encodings and romanizations.
    Downloads: 0 This Week
  • 6
    Donatus is an on-going project consisting of Python, NLTK-based tools and grammars for deep parsing and syntactical annotation of Brazilian Portuguese corpora. It includes a user-friendly graphical user interface for building syntactic parsers with the NLTK, providing some additional functionalities.
    Downloads: 0 This Week
  • 7
    iGREAT is an open-source, statistical machine translation software toolkit based on finite-state models.
    Downloads: 0 This Week
  • 8

    LanguageTool

    Proofreading Software for 20+ Languages

    LanguageTool is an Open Source language/grammar checker. *** THIS REPOSITORY IS OUT OF DATE, see https://github.com/languagetool-org INSTEAD ***
    Downloads: 0 This Week
  • 9
    Web site to translate text from Spanish to a regularized Spanish called "espanes". This language adaptation is very useful for learning Spanish because it is a simplified version with fewer verbal moods, enhanced accents, and reduced prefixes, infixes and suffixes....
    Downloads: 0 This Week
  • 10

    pdf2mp3

    Simply convert your PDF files into audio books

    Summary: Are your eyes tired of reading ebooks on a tablet or cell-phone screen? Do you have difficulty reading from an LCD screen, especially in a moving vehicle? This software is for you! It converts your PDF files to MP3 audio books. Special features (compared to similar projects): Each page is in a separate MP3 file. Created MP3 files have ID3v2 tags showing the book name and page number. Multi-threaded conversion uses all CPU cores, making conversion several times faster.
    Downloads: 0 This Week
  • 11
    Better PO Editor is an editor for .po files, used to generate compiled gettext .mo files, which many programs and websites use to localize the user interface. It offers great features... It's worth a try! PLEASE NOTE: the project moved to GitHub: see https://github.com/mlocati/betterpoeditor/releases
    Downloads: 0 This Week
  • 12
    A simulation package for investigating the dynamics of complex controversy.
    Downloads: 0 This Week
  • 13
    Fast Fuzzy Inference System
    FFIS, or Fast Fuzzy Inference System, is a portable and optimized implementation of Fuzzy Inference Systems. It supports both the Mamdani and Takagi-Sugeno methods. The main idea behind this tool is to provide case-specific techniques, rather than general solutions, for resolving complicated mathematical calculations. This leads to more efficient defuzzification algorithms for Mamdani's model. Most systems in Mamdani's model can be defuzzified in O(n²) or even O(n) time, where n is the number of...
    Downloads: 0 This Week
  • 14
    CRIS-IE-Smoking

    GATE based app to extract patient smoking status from free text

    This application was developed by the NIHR Biomedical Research Centre at the Institute of Psychiatry and South London and Maudsley NHS Foundation Trust, in collaboration with the University of Sheffield. Its purpose is to identify the smoking status of an individual, based on text evidence in clinical notes. Currently, it classifies patients as 'current', 'past' or 'never'. It runs on the GATE infrastructure, available at http://gate.ac.uk/. Please contact richard.g.jackson@slam.nhs.uk for...
    Downloads: 0 This Week
  • 15

    CoocViewer

    Viewer for co-occurrences and positional co-occurrences

    A Demo is available at: http://coocviewer.sourceforge.net/coocviewer/index.php
    Downloads: 0 This Week
  • 16

    Hermes Natural Language Processing

    A repository of software, documentation and data for NLP

    Hermes is a repository of software, documentation and data for NLP. I am currently adding corpora extracted from Wikipedia (mostly in Romance languages).
    Downloads: 1 This Week
  • 17
    ValiTerms

    Validation of terms in corpus

    ValiTerms is a tool that helps validate terms in a corpus. It finds their occurrences and lets terminologists decide whether a term is relevant. ValiTerms is developed at LIPN (http://www-lipn.univ-paris13.fr), RCLN team. Please consult the wiki for instructions about installation and usage.
    Downloads: 0 This Week
  • 18

    NetBeans Dictionaries

    Additional dictionary files for the NetBeans spellchecker.
    Downloads: 0 This Week
  • 19
    CoSyne Integrated Prototype
    Multilingual Content Synchronization with Wikis: CoSyne is a Research and Technological Development project co-funded by the European Union. Details: http://cosyne.eu
    Downloads: 0 This Week
  • 20
    Various tools for creating annotated parallel corpora, including pre-trained tagging and parsing models for various languages, sentence alignment tools and word alignment tools. Uplug also includes a web-based interface for interactive sentence and word alignment and scripts for indexing and querying parallel corpora using the Corpus Workbench (CWB). Download 'uplug-main' first and then add other packages.
    Downloads: 0 This Week
  • 21

    miac-p

    Code for syntactic parsing and other NLP apps.

    Code for syntactic parsing and other natural language processing applications.
    Downloads: 0 This Week
  • 22
    AzConvert is an open source program that converts between the different scripts of the Azerbaijani language (Latin, Arabic and Cyrillic). It's written in Qt.
    Downloads: 2 This Week
  • 23
    Unicode Conversion Gateway is a web-based proxy server that converts some Indian language web pages encoded in proprietary encodings into Unicode. Padma, a popular Firefox extension, is extended and reimplemented in PHP to create this proxy server.
    Downloads: 0 This Week
  • 24

    BuckTagger

    User-assisted tool for Arabic stem entry to Buckwalter Morpho Analyzer

    Using rules written in a Drools decision table, BuckTagger determines the correct Buckwalter Tag based on morphological properties of the input, automatically extracted or given by the user. At the moment, BuckTagger is not complete; it can only handle input that is:
      - Uninflected
      - In lexical form, i.e., no clitics or affixes.
      - A Perfect or Imperfect Verb
      - Preferably the first and before-last letters are diacritized/vocalized.
    The interface is in Arabic. See the README for...
    Downloads: 0 This Week
  • 25
    The TebCorp collection is a large thematic modern Persian text collection which consists of XXX GB of text from Tebyan Portal. TebCorp contains more than XXX articles about XXX topics and includes more than XXX total words and XXX distinct words.
    Downloads: 0 This Week