Showing 82 open source projects for "text batch processing tools"

  • 1
    Botpress

    Dev tools to reliably understand text and automate conversations

    We make building chatbots much easier for developers. We have put together the boilerplate code and infrastructure you need to get a chatbot up and running. We offer a complete developer-friendly platform that ships with all the tools you need to build, deploy, and manage production-grade chatbots in record time. It includes built-in Natural Language Processing tasks such as intent recognition, spell checking, entity extraction, and slot tagging (and many others). A visual conversation studio to design...
    Downloads: 12 This Week
  • 2
    Batch Insert License

    Prepend a copyright/license declaration to many files at once.

    Batch Insert License prepends a block of text describing the applicable software license to many source files automatically. It can be used to: add a specified license comment block to the top of each source file in your project; replace an existing license comment block with a specified new one in each source file; or delete an existing license comment block from each source file. It is written in Java and runs on any OS that has Java installed.
    Downloads: 0 This Week
  • 3
    VideoSrt

    Windows-GUI

    ...VideoSrt. It is written in Golang and developed with the lxn/walk Windows GUI toolkit. It is an open source tool that recognizes speech in video and automatically generates SRT subtitle files. It is suited to business scenarios that need to generate Chinese/English subtitles and text files for media (video/audio) quickly and in batches. Features: recognize video/audio speech to generate subtitle files (with Chinese-English translation and bilingual subtitles); extract speech text from video/audio; batch translation, filtering, and encoding of SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, and the stated recognition rate for standard Mandarin/English is over 95%. ...
    Downloads: 25 This Week
  • 4
    Buzz is a fast graphical editor for XML files with special support for OPML. Using the OPML convergence tools, it can edit just about any outline and many forms of indented text, including Python. In fact, Buzz was written with Buzz! It is written in P
    Downloads: 4 This Week
  • 5
    Next Generation Programming

    Compose Software Without Writing Any Programming Code

    "Next Generation Programming - Programming Without Coding Software" is a drag-and-drop wizard for creating simple or complex applications without writing any programming language code. The software is written in Java for novice and expert programmers alike. Programmers can build software with visual tools: drag-and-drop components, visual editors... Programmers can use the software to compose simple or complex applications: database programs, circuit design, generate...
    Downloads: 2 This Week
  • 6
    SpringAll

    Step by step, learn Spring Boot, Spring Boot & Shiro, Spring Batch

    SpringAll is a comprehensive learning project that gathers a wide range of Spring, Spring Boot, and Spring Cloud demos in one repository. It is designed for developers who want to deepen their understanding of the Spring ecosystem by exploring concrete, runnable code samples. Each module focuses on a specific technology or integration—covering web applications, ORM frameworks, microservices, caching, messaging, security, distributed systems, and monitoring. The repository emphasizes both...
    Downloads: 0 This Week
  • 7
    RadonDB

    RadonDB is an open source, cloud-native MySQL database

    RadonDB is a cloud-native database based on MySQL, architected as a fully distributed cluster that enables unlimited scalability (scale-out), capacity, and performance. It supports distributed transactions that ensure high data consistency, and uses MySQL as the storage engine for trusted data reliability. RadonDB is compatible with the MySQL protocol and supports automatic table sharding as well as a batch of automation features that simplify maintenance and operation workflows. RadonDB...
    Downloads: 0 This Week
  • 8
    fastNLP

    fastNLP: A Modularized and Extensible NLP Framework

    fastNLP is a lightweight framework for natural language processing (NLP) whose goal is to quickly implement NLP tasks and build complex models. A unified Tabular data container simplifies data preprocessing. Built-in Loader and Pipe components for multiple datasets eliminate the need for preprocessing code. It also offers various convenient NLP tools, such as embedding loading (including ELMo and BERT), intermediate data caching, etc.
    Downloads: 0 This Week
  • 9
    Pure Bash Bible

    A collection of pure bash alternatives to external processes

    pure-bash-bible is a collection of pure Bash scripting techniques that demonstrate how to accomplish common and complex tasks using only built-in Bash features. Its goal is to reduce reliance on external tools like sed, awk, or grep, which can slow down scripts and add unnecessary dependencies. The project is organized as a reference book of function-based code snippets, each showcasing practical solutions for string manipulation, text processing, file operations, and more. By relying exclusively on Bash built-ins, these methods can make scripts faster, more portable, and easier to maintain. ...
    Downloads: 4 This Week
  • 10
    Moritz

    transfer xml into specific text-formats (html, dot, source-code, ...)

    Moritz is an add-on to the well-known tool Doxygen. It generates Nassi-Shneiderman diagrams of the functions and methods in C/C++ source code as HTML files, which can be included in software documentation or simply viewed in an HTML browser.
    Downloads: 6 This Week
  • 11
    CSVfix

    Command-line tool specifically designed to deal with CSV data

    ...Unfortunately, the CSV files you are given, or are required to produce, never seem to be in quite the right format for your particular business application. And because of the structure of CSV records, using standard text processing tools like sed, awk and perl is not as simple as it might be. Usage: http://csvfix.byethost5.com/csvfix15/csvfix.html?csvfix.html?Usage.html?i=1&i=2 CSVfix aims to provide a solution to these problems. It is a command-line stream editor specifically designed to deal with CSV data. With it you can, among other things:
    Downloads: 74 This Week
  • 12
    MuLanPa

    transfer text in diverse formats into specific xml parser-trees

    MuLanPa is a source analyser with a configurable parser that may be used for several programming languages. Its XML output is intended for tools such as project browsers or code viewers like Moritz (www.sourceforge.net/projects/moritz/).
    Downloads: 0 This Week
  • 13
    AhoCorasickDoubleArrayTrie

    An extremely fast implementation of Aho Corasick algorithm

    AhoCorasickDoubleArrayTrie is a Java implementation of the Aho–Corasick multi-pattern matching algorithm that is optimized using a Double-Array Trie data structure. It is designed for fast keyword scanning across large texts, where you want to search for many patterns simultaneously and efficiently. The core idea is to build an automaton from a dictionary of patterns, then stream through input text to emit matches with minimal overhead. By using a double-array trie representation, the...
    Downloads: 0 This Week
  • 14
    PyTorch Natural Language Processing

    Basic Utilities for PyTorch Natural Language Processing (NLP)

    PyTorch-NLP is a library for Natural Language Processing (NLP) in Python. It’s built with the very latest research in mind, and was designed from day one to support rapid prototyping. PyTorch-NLP comes with pre-trained embeddings, samplers, dataset loaders, metrics, neural network modules and text encoders. It’s open-source software, released under the BSD3 license. With your batch in hand, you can use PyTorch to develop and train your model using gradient descent. ...
    Downloads: 0 This Week
  • 15
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Downloads: 200 This Week
  • 16
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 108 This Week
  • 17
    Betty

    Holberton-style C code checker written in Perl

    Betty is a Perl-based coding style checker that enforces the Holberton School coding style (inspired by the Linux kernel style) for C code and documentation. It identifies inconsistencies, style violations, and formatting issues in C source files. Be aware that some text editors insert spaces instead of tabs by default. For instance, pressing the Tab key in Emacs inserts leading spaces by default, which causes Betty to raise a lot of warnings. Please find some...
    Downloads: 0 This Week
  • 18
    XMLStarlet is a set of command-line utilities (tools) to transform, query, validate, and edit XML documents and files using a simple set of shell commands, in much the same way as is done for text files with the UNIX grep, sed, awk, diff, patch, join, etc. utilities.
    Downloads: 1,116 This Week
  • 19
    aeneas

    Automagically synchronize audio and text (aka forced alignment)

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment). aeneas automatically generates a synchronization map between a list of text fragments and an audio file containing the narration of the text. In computer science this task is known as (automatically computing a) forced alignment.
    Downloads: 0 This Week
  • 20

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. JCLTP is free, open source, and developed in the Java programming language. JCLTP is distributed under the GNU license. It incorporates several technologies for processing information with AI techniques in order to build predictive models for text classification. Its flexible structure of interfaces and classes makes it possible to extend, adapt, and add functionality to JCLTP. Thus, analysis of new types...
    Downloads: 0 This Week
  • 21

    JCLALtext

    Text processing module for JCLAL

    JCLALtext is a class library designed to extend the JCLAL framework to text tasks. JCLALtext is free, open source, and developed in the Java programming language. JCLALtext is distributed under the GNU license. Researchers can use the class library by adding it to their project.
    Downloads: 0 This Week
  • 22

    grepp

    An ultimate text-analysing tool

    A command-line tool for text file analysis, filtering, splitting, and reporting. Runs under Java (1.5+) and supports plugins written in Groovy. Distributions include *nix and Windows batch files.
    Downloads: 0 This Week
  • 23
    The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known third-party NLP tools used in biomedical text processing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.
    Downloads: 0 This Week
  • 24
    SEO & SEM - Marketing Text Writer

    Open Source SEO & SEM Text Creation Tools for free Article Writer

    Open Source Tool for Search Engine Optimization (SEO & SEM) used for automatic content processing. These SEO Content Generators and Article Writers based on Text...
    Downloads: 0 This Week
  • 25
    bitext2tmx CAT bitext aligner/converter
    A free computer-aided translation / computer-assisted translation (CAT) tool to align and convert bitext into TMX translation memory format for use in other CAT tools by translators and other language professionals.
    Downloads: 13 This Week