Showing 106 open source projects for "text batch processing tools"

  • 1
    pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (and no editable text) are processed by optical character recognition (OCR), and the recognized text is added invisibly "behind" the images on each page. pdfsandwich is a command-line tool intended for OCR of scanned books or journals. It can recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script which calls the following binaries:...
    Downloads: 300 This Week
  • 2
    XMLStarlet is a set of command-line utilities for transforming, querying, validating, and editing XML documents and files using a simple set of shell commands, much as is done for text files with UNIX utilities such as grep, sed, awk, diff, patch, and join.
    Downloads: 1,132 This Week
  • 3
    aeneas

    Automagically synchronize audio and text (aka forced alignment)

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment). aeneas automatically generates a synchronization map between a list of text fragments and an audio file containing the narration of the text. In computer science this task is known as (automatically computing a) forced alignment.
    Downloads: 0 This Week
  • 4

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. JCLTP is free, open source, and developed in the Java programming language. JCLTP is distributed under the GNU license. It incorporates several technologies that make it possible to process information while applying AI techniques, in order to build predictive models for text classification. Its flexible structure of interfaces and classes makes it possible to extend, adapt, and add functionality to JCLTP. Thus, analysis of new types...
    Downloads: 0 This Week
  • 5
    Texinfo Web Publisher

    Multi-format web publishing system based on Texinfo

    Texinfo Web Publisher is a Makefile-based publishing system featuring simultaneous content creation in HTML, non-split HTML, Framed HTML, HTML Zip, XML, DocBook, PDF, DjVu, PostScript, DVI, plain text, Info, and EPUB book formats. All Texinfo Web Publisher output formats are generated from a single source. Texinfo Web Publisher can be used for website creation, has FTP deployment capabilities, and supports Cascading Style Sheets (CSS). Texinfo Web Publisher is a low maintenance solution for...
    Downloads: 0 This Week
  • 6
    The tlve program is a command-line tool for parsing different TLV (tag-length-value) structures and printing them in various text-based formats. tlve is developed in a GNU/Linux environment and is distributed under the GPL.
    Downloads: 1 This Week
  • 7

    JCLALtext

    Text processing module for JCLAL

    JCLALtext is a class library designed to extend the JCLAL framework to text tasks. JCLALtext is free, open source, and developed in the Java programming language. JCLALtext is distributed under the GNU license. Researchers can use the class library by adding it to their project.
    Downloads: 0 This Week
  • 8

    isbntools

    A command line tool to extract, transform and get metadata for ISBNs

    As of 2015-06-02, this project is no longer under active development.
    Downloads: 0 This Week
  • 9

    grepp

    An ultimate text-analysing tool

    A command-line tool for text file analysis, filtering, splitting, and reporting. Runs on Java (1.5+) and supports plugins written in Groovy. Distributions include *nix and Windows batch files.
    Downloads: 0 This Week
  • 10
    The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known third-party NLP tools used in biomedical text processing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.
    Downloads: 0 This Week
  • 11
    SEO & SEM - Marketing Text Writer

    Open Source SEO & SEM Text Creation Tools for free Article Writer

    Open source tool for search engine optimization (SEO & SEM), used for automatic content processing. These SEO Content Generators and Article Writers based on Text...
    Downloads: 0 This Week
  • 12
    bitext2tmx CAT bitext aligner/converter
    A free computer-aided translation / computer-assisted translation (CAT) tool to align bitext and convert it into the TMX translation memory format, for use in other CAT tools by translators and other language professionals.
    Downloads: 16 This Week
  • 13
    WM Hyperintensities Segmentation Toolbox

    Open Source White Matter Hyperintensities Segmentation Toolbox

    The Wisconsin White Matter Hyperintensity Segmentation [W2MHS] and Quantification Toolbox is an open source MATLAB toolbox designed for detecting and quantifying White Matter Hyperintensities (WMH) in Alzheimer’s and aging-related neurological disorders. WMHs arise as bright regions on T2-weighted FLAIR images. They reflect comorbid neural injury or cerebral vascular disease burden. Their precise detection is of interest in Alzheimer’s disease (AD) with regard to its prognosis. Our toolbox...
    Downloads: 0 This Week
  • 14
    JPDF Tools
    JPDF Tools is a GUI Java program built on the JPDF Export library. Its main aim is to create PDF files by inserting text, images, or tables. Users can also merge PDF files, split PDF files, and merge images into PDF files; conversion from and to PDF files is planned.
    Downloads: 0 This Week
  • 15
    The DocBook Publishing Utilities are tools that make creating and publishing DocBook easier. The tools are: a Maven plug-in to transform HTML into XML (use after docbkx); an Eclipse DocBook table editor; and Eclipse wizards for initial DocBook files.
    Downloads: 0 This Week
  • 16

    Dvipdfm tool for SCons

    SCons tool to cooperate with dvipdfm program

    SCons is a make replacement providing a range of enhanced features, such as automated dependency generation and built-in compilation cache support. SCons rule sets are Python scripts, so in addition to its own features, SCons lets you use the full power of Python to control compilation. This is a SCons extension (tool) that enables use of the dvipdfm program to convert DVI files to PDF.
    Downloads: 0 This Week
  • 17
    OmegaT+ CAT Tools
    A translation tools suite for Computer-Aided Translation / Computer-Assisted Translation (CAT). A translation processor with translation memory, machine translation and project support, bitext aligner/converter, TMX validator, and others.
    Downloads: 2 This Week
  • 18
    almtools

    Collection of Open Source tools for HP ALM

    Collection of open source and free tools for HP ALM administration, customization, and end-user use. This is a community effort. Feel free to use, share and contribute back!
    Downloads: 0 This Week
  • 19

    Trim Lines

    Trim Lines removes trailing whitespace from source code files

    This is a simple command-line tool that batch processes source code files to remove trailing whitespace and convert all line endings to your system's native style. Usage example: trimlines d:\Projects\SomeProject\src *.c;*.cpp;*.h;*.hpp;*.inc .svn;.git With this command, all files in "d:\Projects\SomeProject\src" (including sub-folders) that match the search masks "*.c;*.cpp;*.h;*.hpp;*.inc" will be processed, excluding the specified folders ".svn;.git". Can process unicode and system native...
    Downloads: 0 This Week
  • 20
    A collection of DITA map and topic files used for checking the performance of tools, such as the DITA-OT, that convert DITA to other formats, including recommended PIs for dealing with presentation needs not covered in the DITA specification. Primary host is now GitHub: https://github.com/jeremygriffith/DITA-Test-Suite
    Downloads: 0 This Week
  • 21
    OmegaT+ Computer Assisted Translation (CAT) tools platform that includes OmegaT+ (translation processor), bitext2tmx (aligner/TMX editor), and Validator (TMX validation).
    Downloads: 2 This Week
  • 22
    XML Editor for www.xical.org
    Downloads: 0 This Week
  • 23
    Ant task to produce documentation with (PDF)LaTeX using BibTeX, Makeindex and GlossTeX.
    Downloads: 0 This Week
  • 24
    Tools to update all pages of a web site with a single command. A header and menu may be copied onto each page. It is also possible to add a header (a licence, for example) to all source files of a project.
    Downloads: 0 This Week
  • 25
    A collection of open source libraries and tools that provide solutions for common problems in processing Arabic text, especially in web applications. These include text normalization, phrase segmentation, text indexing, stop word lists, and common spelling mistakes.
    Downloads: 0 This Week
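The Trim Lines entry (item 19) describes removing trailing whitespace and converting line endings to the native style for files under a folder that match a set of masks, while skipping excluded folders. The Python sketch below illustrates that general technique; it is not the project's code. The function names (trim_text, iter_matching_files, trim_tree), the UTF-8 encoding, and the choice to treat only spaces and tabs as trailing whitespace are all assumptions made for illustration.

```python
import fnmatch
import os
import re

# Matches Windows (\r\n), old Mac (\r), and Unix (\n) line breaks.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def trim_text(text, newline=os.linesep):
    """Strip trailing spaces/tabs from each line and unify line endings."""
    lines = _LINE_BREAK.split(text)
    # A final line break yields a trailing empty string, so it is preserved.
    return newline.join(line.rstrip(" \t") for line in lines)


def iter_matching_files(root, masks, excluded_dirs=()):
    """Yield paths under root whose file name matches any mask."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded folders in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in excluded_dirs]
        for name in filenames:
            if any(fnmatch.fnmatch(name, mask) for mask in masks):
                yield os.path.join(dirpath, name)


def trim_tree(root, masks, excluded_dirs=()):
    """Rewrite matching files in place; return the paths that changed."""
    changed = []
    for path in iter_matching_files(root, masks, excluded_dirs):
        # newline="" disables newline translation, so the original line
        # endings are read and the chosen ones are written exactly.
        with open(path, "r", encoding="utf-8", newline="") as handle:
            original = handle.read()
        cleaned = trim_text(original)
        if cleaned != original:
            with open(path, "w", encoding="utf-8", newline="") as handle:
                handle.write(cleaned)
            changed.append(path)
    return changed
```

A call mirroring the entry's usage example might look like trim_tree("src", ["*.c", "*.cpp", "*.h", "*.hpp", "*.inc"], [".svn", ".git"]). Because the sketch rewrites files in place, it is best tried on a copy or a version-controlled tree first.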