Showing 124 open source projects for "text batch processing tools"

  • 1
    Win32Forth is an ANS-compatible Forth language application development system with many tools: an interactive console, an integrated extensible debugger, a GUI file editor, hypertext rendering, and hyperlinked source files. Use VIEW <word-name> to explore the many files.
    Downloads: 61 This Week
  • 2
    SpeedEULA

    Hungarian text editor

    Hi everyone! This is meant to be a Hungarian text editor program. PRO license code: 74HVR-7ENS9-NDH73-HDM48. Official Discord server: https://discord.gg/VUw6DkZ
    Downloads: 0 This Week
  • 3
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Downloads: 200 This Week
  • 4
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 108 This Week
  • 5
    Betty

    Holberton-style C code checker written in Perl

    Betty is a Perl-based coding style checker that enforces the Holberton School coding style (inspired by the Linux kernel style) for C code and documentation. It identifies inconsistencies, style violations, and formatting issues in C source files. Be aware that some text editors use spaces instead of tabs by default. For instance, pressing the Tab key in Emacs inserts leading spaces by default, which causes Betty to raise many warnings. Please find some...
    Downloads: 0 This Week
  • 6
    pdfsandwich generates "sandwich" OCR pdf files, i.e. pdf files which contain only images (but no editable text) will be processed by optical character recognition (OCR) and the text will be added to each page invisibly "behind" the images. pdfsandwich is a command line tool which is supposed to be useful to OCR scanned books or journals. It is able to recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script which calls the following binaries:...
    Downloads: 299 This Week
  • 7
    XMLStarlet is a set of command-line utilities (tools) to transform, query, validate, and edit XML documents and files using a simple set of shell commands, in much the same way as is done for text files with the UNIX grep, sed, awk, diff, patch, join, etc. utilities.
    Downloads: 1,116 This Week
  • 8
    aeneas

    Automagically synchronize audio and text (aka forced alignment)

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment). aeneas automatically generates a synchronization map between a list of text fragments and an audio file containing the narration of the text. In computer science this task is known as (automatically computing a) forced alignment.
    Downloads: 0 This Week
  • 9

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. JCLTP is free, open source, and developed in the Java programming language. JCLTP is distributed under the GNU license. It incorporates several technologies that enable information processing while applying AI techniques, in order to build predictive models for text classification. Through a flexible structure of interfaces and classes, JCLTP provides the opportunity to extend, adapt, and add functionality. Thus, analysis of new types...
    Downloads: 0 This Week
  • 10
    Texinfo Web Publisher

    Multi-format web publishing system based on Texinfo

    Texinfo Web Publisher is a Makefile-based publishing system featuring simultaneous content creation into HTML, non-split HTML, Framed HTML, HTML Zip, XML, DocBook, PDF, DjVu, PostScript, DVI, Plain text, Info and EPUB book formats. All Texinfo Web Publisher output formats are generated from a single source. Texinfo Web Publisher can be used for website creation, has FTP deployment capabilities, and supports Cascading Style Sheets (CSS). Texinfo Web Publisher is a low maintenance solution for...
    Downloads: 0 This Week
  • 11
    The tlve program is a command-line tool for parsing different TLV (tag-length-value) structures and printing them in various text-based formats. tlve is developed in a GNU/Linux environment and is distributed under the GPL.
    Downloads: 1 This Week
  • 12

    JCLALtext

    Text processing module for JCLAL

    JCLALtext is a class library designed to extend the JCLAL framework to text tasks. JCLALtext is free, open source, and developed in the Java programming language. JCLALtext is distributed under the GNU license. Researchers can use the class library by adding it to their project.
    Downloads: 0 This Week
  • 13

    isbntools

    A command line tool to extract, transform and get metadata for ISBNs

    As of 2015-06-02, this project is no longer under active development.
    Downloads: 0 This Week
  • 14

    grepp

    An ultimate text-analysing tool

    A command-line tool for text file analysis, filtering, splitting, and reporting. Runs under Java (1.5+) and supports plugins written in Groovy. Distributions include *nix and Windows batch files.
    Downloads: 0 This Week
  • 15
    The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known third-party NLP tools used in biomedical text processing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.
    Downloads: 0 This Week
  • 16
    SEO & SEM - Marketing Text Writer

    Open Source SEO & SEM Text Creation Tools for free Article Writer

    Open Source Tool for Search Engine Optimization (SEO & SEM) used for automatic content processing. These SEO Content Generators and Article Writers based on Text...
    Downloads: 0 This Week
  • 17

    sar

    Search and replace command-line tool

    This command-line tool can be used to search and replace text within files.
    Downloads: 1 This Week
  • 18
    Console based Editor

    Console-based text editor and IDE

    Console based EDitor (CED) is a console-based program with a curses-like interface and a built-in IDE. The IDE is expandable: any language interface can be added. It is easy to use and highly customizable. It is part of PD* software and is therefore public domain software. ** IT IS PUBLIC DOMAIN SOFTWARE **
    Downloads: 1 This Week
  • 19
    bitext2tmx CAT bitext aligner/converter
    A free computer-aided translation / computer-assisted translation (CAT) tool to align and convert bitext into the TMX translation memory format, for use in other CAT tools by translators and other language professionals.
    Downloads: 13 This Week
  • 20
    WM Hyperintensities Segmentation Toolbox

    Open Source White Matter Hyperintensities Segmentation Toolbox

    Wisconsin White Matter Hyperintensity Segmentation [W2MHS] and Quantification Toolbox is an open source MATLAB toolbox designed for detecting and quantifying White Matter Hyperintensities (WMH) in Alzheimer’s and aging-related neurological disorders. WMHs arise as bright regions on T2-weighted FLAIR images. They reflect comorbid neural injury or cerebral vascular disease burden. Their precise detection is of interest in Alzheimer’s disease (AD) with regard to its prognosis. Our toolbox...
    Downloads: 0 This Week
  • 21
    JPDF Tools
    JPDF Tools is a GUI Java program built on the JPDF Export library. Its main aim is to create PDF files by inserting text, images, or tables. Users can also merge PDF files, split PDF files, merge images into PDF files, and, soon, convert to and from PDF files.
    Downloads: 0 This Week
  • 22
    The DocBook Publishing Utilities are tools that make creating and publishing DocBook easier. The tools are: a Maven plug-in to transform HTML into XML (use after docbkx); an Eclipse DocBook table editor; and Eclipse wizards for initial DocBook files.
    Downloads: 0 This Week
  • 23

    Dvipdfm tool for SCons

    SCons tool to cooperate with dvipdfm program

    SCons is a make replacement providing a range of enhanced features such as automated dependency generation and built-in compilation cache support. SCons rule sets are Python scripts, so in addition to its own features, SCons lets you use the full power of Python to control compilation. This SCons extension (tool) enables use of the dvipdfm program to convert DVI files to PDF.
    Downloads: 0 This Week
  • 24

    UltraDBC

    Microsoft SQL Server database data compare tool

    A command-line tool that compares Microsoft SQL Server table data between tables with the same columns in different databases. The original databases are not modified. Results are saved as an SQL command file or displayed on the console. A multi-threaded, open source project intended to fill an empty niche for this kind of freeware. As a command-line application, it can easily be used in batch processing.
    Downloads: 0 This Week
  • 25
    OmegaT+ CAT Tools
    A translation tools suite for Computer-Aided Translation / Computer-Assisted Translation (CAT). A translation processor with translation memory, machine translation and project support, bitext aligner/converter, TMX validator, and others.
    Downloads: 4 This Week