Showing 110 open source projects for "language processing"

  • 1
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Downloads: 114 This Week
  • 2
    cebe/markdown

    A super fast, highly extensible markdown parser for PHP

    ...It is a set of PHP classes, each representing a Markdown flavor, and a command line tool for converting Markdown files to HTML files. The implementation focuses on being fast (see benchmark) and extensible. You can add language elements by hooking directly into the parser; no (possibly error-prone) pre- or post-processing is needed to extend the language. It is also well tested, to provide the best rendering results even in edge cases where other parsers fail.
    Downloads: 0 This Week
  • 3

    Discriminative Language Editor

    Discriminative language editor based on ontologies

    A text editor written in Java that detects discriminative expressions while the user is typing. When the internal ontology-based analyzer detects a potentially discriminative expression, it alerts the user by underlining the related words in the text. A descriptive message about the issue is also shown when the cursor is placed over the flagged expression.
    Downloads: 0 This Week
  • 4

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. It is free, open source, written in Java, and distributed under the GNU license. It incorporates several technologies for processing information with AI techniques in order to build predictive models for text classification. Its flexible structure of interfaces and classes lets you extend, adapt, and add functionality to JCLTP.
    Downloads: 2 This Week
  • 5
    Ansj Chinese word segmentation

    Ansj word segmentation

    ...At present, it implements Chinese word segmentation, Chinese name recognition, user-defined dictionaries, keyword extraction, automatic summarization, and keyword tagging. It can be applied to natural language processing and related tasks, and is suitable for projects that require high-quality word segmentation.
    Downloads: 4 This Week
  • 6
    Notepad3

    Light-weight Scintilla-based text editor with syntax highlighting

    Notepad3 is a fast and light-weight Scintilla-based text editor with syntax highlighting. Notepad3 is an excellent replacement for the default Windows text editor. Notepad3 offers many extra features over Notepad. It has a small memory footprint, but is powerful enough to handle most programming jobs.
    Downloads: 18 This Week
  • 7
    Benkyou Studio

    Benkyou Studio is a language study toolkit.

    Benkyou Studio is intended to be a one-stop integrated solution for working with and learning languages. For the learner, it has flashcards, a multiple-choice quiz that remembers and adjusts to the words you are struggling with, and speech synthesis that helps you hear the words as you study; you can even export the word list to sound files for your portable music player. For the professional, it has Unicode lookup and converters, a character map viewer and exporter, a text file converter for...
    Downloads: 0 This Week
  • 8

    PLP

    Powerful pre-processor

    Powerful Verilog preprocessor. PLP stands for Perl Pre-processor. Perl is used as a "control language" that is embedded in the Verilog code (or any other code) to generate code on the fly. It is commonly used as a Verilog pre-processor but can be used with any target/output language (C, C++, Java, VHDL, plain text, etc.).
    Downloads: 0 This Week
  • 9
    Virastyar

    Virastyar is a spell checker for low-resource languages

    Virastyar is a free and open-source (FOSS) spell checker. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for processing low-resource languages, especially Persian and RTL languages. Publications: Kashefi, O., Nasri, M., & Kanani, K. (2010). Towards Automatic Persian Spell Checking. SCICT. Kashefi, O., Sharifi, M., & Minaie, B. (2013). A novel string distance metric for ranking Persian respelling suggestions. Natural Language Engineering, 19(2), 259-284. ...
    Downloads: 305 This Week
  • 10
    Velocity Editor Plugin

    VTL (Velocity Template Language) edit support for the NetBeans IDE.

    Provides basic support for Velocity's *.vm and *.vsl files. Syntax coloring, basic error highlighting, and brace matching are achieved with a lexer and parser based on Apache Velocity's 1.6.2 specification and compiled with JavaCC 5.0.
    Downloads: 2 This Week
  • 11
    wordaxe (formerly deco-cow): A hyphenation library for Python. It offers several hyphenation algorithms: the pattern-based approach from TeX/OOO, and decomposition of compound words for the German language. Includes support for paragraph line-breaking with ReportLab.
    Downloads: 0 This Week
  • 12
    This project is devoted to the development of natural language processing tools and resources for the Lingala language, which is spoken by tens of millions of people in central Africa.
    Downloads: 3 This Week
  • 13
    ArabicDiacritizer

    An automatic restoration of Arabic diacritic marks

    This software restores Arabic diacritical marks. It is based mainly on deep neural network architectures. The algorithm generates diacritized text with determined end case. The algorithm is described in detail in: Ilyes Rebai and Yassine BenAyed, 'Text-to-speech synthesis system with Arabic diacritic recognition system', Computer Speech & Language, 2015. We would appreciate it very much if you cite our related work. Installation...
    Downloads: 0 This Week
  • 14

    pyWeb Literate Programming Tool

    Literate Programming in pure Python

    pyWeb is a Literate Programming tool that will work with any markup language and any programming language. The idea is to allow you to create great documentation with as few constraints or limitations as possible.
    Downloads: 0 This Week
  • 15
    A Java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, that creates XML output. Also contains a few utility classes for HTML, CSV, and text parsing, and additional character sets.
    Downloads: 0 This Week
  • 16
    writeup
    Programming language for converting source documents into HTML or XML. Writeup is a combination of a markup language (similar to markdown) and a macro pre-processing language that enables a formal production system to be set up for documents.
    Downloads: 0 This Week
  • 17
    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.
    Downloads: 3 This Week
  • 18
    bitext2tmx CAT bitext aligner/converter
    A free computer-aided translation / computer-assisted translation (CAT) tool that aligns bitext and converts it into the TMX translation memory format, for use in other CAT tools by translators and other language professionals.
    Downloads: 7 This Week
  • 19
    Colorer Library
    Colorer provides source text syntax highlighting services. It colorizes source code in editor systems (more than 200 syntaxes). It uses the powerful HRC format (XML, RE, context-free grammars), which allows it to support any language. Available as an Eclipse plugin.
    Downloads: 10 This Week
  • 20

    Spock Text Editor

    Simple text editor for PHP, JSP, HTML, etc.

    This app is written in Java, so it is fully compatible with Windows, Mac, and Linux, 32- or 64-bit. It's a simple and fast text editor and supports: *.txt *.jsp *.php *.c *.h (headers for the C language) *.java *.htm/html. This is the first version and my first application in Java. I hope you like it! See you in version 2! ;-)
    Downloads: 0 This Week
  • 21
    This project provides implementations of spellcheckers in the Java language: spellchecker implementations for TinyMCE based on Jazzy and google-spellchecker-service. Authors: Rich Irwin, Andrey Chorniy. Integration details are available at https://achorniy.wordpress.com/2009/08/11/tinymce-spellchecker-in-java/ and https://achorniy.wordpress.com/tag/spellchecker/
    Downloads: 0 This Week
  • 22
    A tool that helps find the corresponding interwikis when translating a Wikipedia article from one language to another.
    Downloads: 0 This Week
  • 23
    JPDF Tools
    JPDF Tools is a GUI Java program built on the JPDF Export library. Its main aim is to create PDF files by inserting text, images, or tables. Users can also merge PDF files, split PDF files, merge images into PDF files, and soon convert from and to PDF files.
    Downloads: 0 This Week
  • 24

    EncTool

    Command line tool to detect and convert file encodings.

    Command line tool to detect and convert file encodings. Works with files or directories. Can be used to add or remove a UTF-8 BOM. Multi-platform. EncTool requires Java 1.5 or higher.
    Downloads: 0 This Week
  • 25
    DPRK pull is a script that pulls the English-language North Korean news articles from the KCNA website and puts them into one file for reading by a text-to-speech program.
    Downloads: 0 This Week