Search Results for "text batch processing tools"

Showing 79 open source projects for "text batch processing tools"

1

text-extract-api

Document (PDF, Word, PPTX ...) extraction and parse API

...Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction capabilities into a unified API that standardizes the output. The platform supports automated processing pipelines that detect file types and apply the appropriate extraction method to obtain the most accurate text representation possible. It can be integrated into document analysis systems, knowledge retrieval tools, and AI pipelines that rely on clean textual data. ...

Downloads: 0 This Week

Last Update: 2026-03-05
See Project
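The file-type detection and dispatch described for text-extract-api can be illustrated with a minimal sketch. This is not the project's actual code or API: the `EXTRACTORS` registry, the `extract_text` function, and the output field names are all hypothetical, and only plain-text formats are handled so that the example stays within the Python standard library.

```python
from pathlib import Path
from typing import Callable, Dict


def extract_plain_text(path: Path) -> str:
    """Read a plain-text file as UTF-8, replacing undecodable bytes."""
    return path.read_text(encoding="utf-8", errors="replace")


# Hypothetical registry mapping a lowercase file suffix to an extractor.
# A real service would register PDF, Word, or PPTX parsers here as well.
EXTRACTORS: Dict[str, Callable[[Path], str]] = {
    ".txt": extract_plain_text,
    ".md": extract_plain_text,
}


def extract_text(path: Path) -> dict:
    """Pick an extractor by file suffix and return a standardized record."""
    suffix = path.suffix.lower()
    extractor = EXTRACTORS.get(suffix)
    if extractor is None:
        raise ValueError(f"no extractor registered for {suffix!r}")
    # Every extractor returns a string, so the output shape is the same
    # regardless of the input format (field names are assumptions).
    return {"filename": path.name, "type": suffix, "text": extractor(path)}
```

Keeping the format-specific logic behind one registry is what lets callers treat every document the same way; supporting a new format means adding one entry rather than changing the calling code.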
2

abogen

Generate audiobooks from EPUBs, PDFs and text with captions

abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...

Downloads: 4 This Week

Last Update: 2026-02-06
See Project
3

Chandra

OCR model for complex documents with layout-aware structured outputs

...Chandra can be run locally using transformer-based inference or deployed with a high-performance server setup for large-scale processing. It also includes command-line tools and optional web-based interfaces to simplify interaction and batch processing workflows.

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
4

LLM-Aided OCR Project

Enhances Tesseract OCR output using LLMs (local or API)

...This AI-assisted correction process helps reconstruct missing characters, fix formatting mistakes, and produce more coherent text outputs. The project is particularly useful for digitizing historical documents, research papers, and scanned materials where traditional OCR often struggles. It also includes tools for processing batches of images or documents, enabling automated document digitization workflows.

Downloads: 0 This Week

Last Update: 2026-03-22
See Project
5

OpenMed

Open source healthcare AI

...OpenMed can be used in three main ways: as a simple Python API for scripts and notebooks, as a Docker-friendly FastAPI service for backend integration, and as a batch-processing system for multi-document workflows.

Downloads: 12 This Week

Last Update: 5 days ago
See Project
6

deepdoctection

A Repo For Document AI

deepdoctection is a document AI framework that applies deep learning techniques to analyze and extract structured data from scanned documents, PDFs, and images. It is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models itself but lets you build pipelines using well-regarded libraries for object detection, OCR, and selected NLP tasks, and provides an integrated framework for...

Downloads: 0 This Week

Last Update: 2026-05-15
See Project
7

AUTOMATIC1111 Stable Diffusion web UI

Stable Diffusion web UI

...The interface also supports prompt editing, batch processing, custom scripts, and many community extensions, making it a highly customizable and continually evolving platform for creative AI art generation.

1 Review

Downloads: 176 This Week

Last Update: 2025-06-02
See Project
8

Umi-OCR

OCR software, free and offline

Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines. The software supports flexible usage patterns including screenshot capture OCR, batch processing of large sets of images or documents, PDF parsing, QR code detection, and layout-aware paragraph output. ...

Downloads: 53 This Week

Last Update: 2026-01-15
See Project
9

Faster Whisper

Faster Whisper transcription with CTranslate2

Faster Whisper is an optimized implementation of the Whisper speech recognition model designed to deliver significantly faster inference while maintaining comparable accuracy. It leverages efficient inference engines and optimized computation strategies to reduce latency and resource consumption. The system is particularly useful for real-time or large-scale transcription tasks where performance is critical. It supports multiple model sizes, allowing users to balance speed and accuracy based...

Downloads: 40 This Week

Last Update: 2026-04-06
See Project
10

Voice-Pro

Comprehensive Gradio WebUI for audio processing

Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It installs with one click and creates a Miniconda virtual environment that runs completely separate from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.

1 Review

Downloads: 46 This Week

Last Update: 2025-12-05
See Project
11

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...

Downloads: 4 This Week

Last Update: 2026-02-03
See Project
12

SciSpaCy

A full spaCy pipeline and models for scientific/biomedical documents

ScispaCy is a spaCy extension optimized for processing biomedical and scientific text, providing domain-specific NLP models for tasks like named entity recognition (NER) and dependency parsing.

Downloads: 2 This Week

Last Update: 2025-10-01
See Project
13

Hazm

Persian NLP Toolkit

Hazm is a natural language processing (NLP) library for Persian text, offering various tools for text preprocessing, tokenization, part-of-speech tagging, and more.

Downloads: 0 This Week

Last Update: 2026-04-01
See Project
14

Hugging Face - Speech To Speech

Open speech-to-speech models and pipelines by Hugging Face

This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. ...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
15

Sygil WebUI

Stable Diffusion web UI

Sygil WebUI is a browser-based interface for running Stable Diffusion image generation locally or on a server, wrapping common text-to-image and image-to-image workflows into a practical UI. It provides multiple UI modes (including a legacy Gradio interface) and focuses on making iterative prompting, parameter tuning, and post-processing accessible without writing code. The UI exposes core generation controls like resolution, CFG guidance, sampling steps, samplers, seeds, and batch generation so users can reproduce results and refine outputs systematically. ...

Downloads: 0 This Week

Last Update: 2026-02-03
See Project
16

DeepSeek-OCR

Contexts Optical Compression

DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...

Downloads: 7 This Week

Last Update: 2026-01-27
See Project
17

edge-tts

Use Microsoft Edge's online text-to-speech service from Python

edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common...

Downloads: 25 This Week

Last Update: 2026-03-22
See Project
18

Insanely Fast Whisper

An opinionated CLI to transcribe Audio files w/ Whisper on-device

Insanely Fast Whisper is a high-performance command-line tool designed to dramatically accelerate speech-to-text transcription using OpenAI’s Whisper models on local hardware. It leverages modern optimizations such as batch processing, mixed precision, and advanced attention mechanisms like Flash Attention to significantly reduce inference time while maintaining high transcription accuracy. The project is built on top of the Transformers ecosystem and integrates with libraries such as Optimum to maximize GPU efficiency. ...

Downloads: 4 This Week

Last Update: 2026-03-26
See Project
19

Pocket TTS

A TTS that fits in your CPU (and pocket)

...Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.

Downloads: 11 This Week

Last Update: 2026-05-04
See Project
20

MindNLP

Easy-to-use and high-performance NLP and LLM framework

MindNLP is a natural language processing library built on the MindSpore framework, providing tools and models for various NLP tasks.

Downloads: 0 This Week

Last Update: 2025-11-05
See Project
21

DeerFlow

Deep Research framework, combining language models with tools

DeerFlow is an open-source, community-driven “deep research” framework / multi-agent orchestration platform developed by ByteDance. It aims to combine the reasoning power of large language models (LLMs) with automated tool-use — such as web search, web crawling, Python execution, and data processing — to enable complex, end-to-end research workflows. Instead of a monolithic AI assistant, DeerFlow defines multiple specialized agents (e.g. “planner,” “searcher,” “coder,” “report generator”)...

Downloads: 94 This Week

Last Update: 2026-05-18
See Project
22

SwarmUI

Modular AI image and video generation web UI with extensible tools

SwarmUI is a modular web-based user interface designed for AI-driven image generation, with a strong focus on usability, performance, and extensibility. It serves as a unified environment for working with multiple AI models, including Stable Diffusion and newer image and video generation systems, allowing users to create and manage outputs through a browser interface. SwarmUI is built to accommodate both beginners and advanced users by offering a simple “Generate” interface alongside more...

Downloads: 8 This Week

Last Update: 2026-03-18
See Project
23

BettaFish

Public opinion analysis system

...Unlike simpler analytics tools, BettaFish employs agent collaboration and a “forum” style internal mechanism to combine diverse model outputs, making the analysis richer and more robust. It also integrates multimodal processing, enabling it to parse images and video alongside text.

Downloads: 1 This Week

Last Update: 2026-02-17
See Project
24

Qwen3-ASR

Qwen3-ASR is an open-source series of ASR models

Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware...

Downloads: 2 This Week

Last Update: 2026-02-09
See Project
25

GalTransl

Automated translation solution for visual novels

GalTransl is an automated translation system specifically designed for visual novels, particularly those in the “galgame” genre, leveraging large language models to streamline and enhance the translation process. It integrates support for multiple advanced LLM providers such as GPT-4, Claude, DeepSeek, and other models, enabling high-quality, context-aware translations that go beyond traditional machine translation approaches. The platform is built to handle the unique structure of visual...

Downloads: 2 This Week

Last Update: 4 days ago
See Project

© 2026 Slashdot Media. All Rights Reserved.