Search Results for "text batch processing tools"


62 projects for "text batch processing tools" with 2 filters applied:

Filters applied: Artificial Intelligence, ChromeOS

1

Chandra

OCR model for complex documents with layout-aware structured outputs

...Chandra can be run locally using transformer-based inference or deployed with a high-performance server setup for large-scale processing. It also includes command-line tools and optional web-based interfaces to simplify interaction and batch processing workflows.

Downloads: 3 This Week

Last Update: 2026-03-18
2

text-extract-api

Document (PDF, Word, PPTX ...) extraction and parse API

...Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction capabilities into a unified API that standardizes the output. The platform supports automated processing pipelines that detect file types and apply the appropriate extraction method to obtain the most accurate text representation possible. It can be integrated into document analysis systems, knowledge retrieval tools, and AI pipelines that rely on clean textual data. ...

Downloads: 0 This Week

Last Update: 2026-03-05
3

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...

Downloads: 5 This Week

Last Update: 2026-02-03
4

Hugging Face - Speech To Speech

Open speech-to-speech models and pipelines by Hugging Face

This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. ...

Downloads: 0 This Week

Last Update: 2026-03-18
5

DeepSeek-OCR

Contexts Optical Compression

DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...

Downloads: 8 This Week

Last Update: 2026-01-27
6

edge-tts

Use Microsoft Edge's online text-to-speech service from Python

edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common...

Downloads: 35 This Week

Last Update: 2026-03-22
7

Short Video Factory

AI tool for automatic batch short video creation and editing

Short Video Factory is an open source desktop application designed to simplify the creation of short-form videos using AI-driven automation. It enables users to generate product marketing clips and general content videos by combining simple prompt-based input with pre-prepared media assets. Short Video Factory integrates multiple stages of video production, including script generation, voice synthesis, video editing, and subtitle effects, into a single streamlined workflow. By leveraging AI...

Downloads: 1 This Week

Last Update: 2026-04-07
8

Pocket TTS

A TTS that fits in your CPU (and pocket)

...Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.

Downloads: 12 This Week

Last Update: 2026-05-04
9

SwarmUI

Modular AI image and video generation web UI with extensible tools

SwarmUI is a modular web-based user interface designed for AI-driven image generation, with a strong focus on usability, performance, and extensibility. It serves as a unified environment for working with multiple AI models, including Stable Diffusion and newer image and video generation systems, allowing users to create and manage outputs through a browser interface. SwarmUI is built to accommodate both beginners and advanced users by offering a simple “Generate” interface alongside more...

Downloads: 8 This Week

Last Update: 2026-03-18
10

AI App Lab

Implementing large models into scenario-based applications

AI App Lab is an open-source platform developed by Volcengine that provides tools, SDKs, and example applications for building real-world AI applications powered by large language models. The project focuses on helping developers bridge the gap between AI models and practical business use cases by offering a structured environment for creating production-ready AI systems. It includes a high-level SDK called Arkitect, which provides workflows and tools for integrating models, plugins, and multimodal capabilities such as text, image, and voice processing.

Downloads: 0 This Week

Last Update: 2026-03-17
11

Qwen3-ASR

Qwen3-ASR is an open-source series of ASR models

Qwen3-ASR is an automatic speech recognition system in the QwenLM family, developed to convert spoken language into text with strong accuracy and real-time performance. As a specialized ASR variant of the broader Qwen language model ecosystem, it focuses on capturing reliable transcriptions from audio sources such as recordings, live streams, or conversational inputs while supporting low latency use cases. The architecture combines advanced neural acoustic modeling with context-aware...

Downloads: 2 This Week

Last Update: 2026-02-09
12

GalTransl

Automated translation solution for visual novels

GalTransl is an automated translation system specifically designed for visual novels, particularly those in the “galgame” genre, leveraging large language models to streamline and enhance the translation process. It integrates support for multiple advanced LLM providers such as GPT-4, Claude, DeepSeek, and other models, enabling high-quality, context-aware translations that go beyond traditional machine translation approaches. The platform is built to handle the unique structure of visual...

Downloads: 2 This Week

Last Update: 2 days ago
13

AudioCraft

AudioCraft is a library for audio processing and generation

...It also contains training code and recipes, so researchers can fine-tune on custom data or explore new objectives without building infrastructure from scratch. Example notebooks, CLI tools, and audio utilities help with prompt design, conditioning on reference audio, and post-processing to produce ready-to-share outputs.

Downloads: 11 This Week

Last Update: 2025-10-13
14

StoryGen Atelier

AI-assisted storyboard and video generation tool

StoryGen Atelier is an advanced creative tool that blends AI with visual storytelling, making it possible to generate fully structured storyboards and stitched videos from text prompts without requiring manual art or animation skills. Users begin with natural language descriptions of their story or scene, and the system uses state-of-the-art large models to generate both the script and corresponding frames. Once individual frames are created, a second AI model generates transition clips that smoothly link the frames into a coherent short video sequence, and the tool then assembles everything into a finished video using standard video processing tools.

Downloads: 3 This Week

Last Update: 2026-02-04
15

LLM TLDR

95% token savings. 155x faster queries. 16 languages

...To enhance usability, LLM-TLDR includes command-line tools and integration examples for common workflows like batch summarization, webhook ingestion, and automation in documentation pipelines.

Downloads: 0 This Week

Last Update: 2026-01-27
16

ESPnet

End-to-end speech processing toolkit

ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while...

Downloads: 0 This Week

Last Update: 2026-04-22
17

NLP

Open source NLP guide with models, methods, and real use cases

NLP is an open source introductory resource for natural language processing, presented as a continuously updated book hosted on GitHub. It explains how machines process and understand human language, combining theory with practical examples. It covers core NLP concepts such as text representation, feature extraction, and model evaluation, alongside hands-on implementations using tools like Word2Vec, TF-IDF, and FastText.

Downloads: 2 This Week

Last Update: 2 days ago
18

SemTools

Semantic search and document parsing tools for the command line

SemTools is an open-source command-line toolkit designed for document parsing, semantic indexing, and semantic search workflows. The project focuses on enabling developers and AI agents to process large document collections and extract meaningful semantic representations that can be searched efficiently. Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead. SemTools can parse documents,...

Downloads: 2 This Week

Last Update: 2026-03-13
19

LiveKit Agents

Framework for building realtime multimodal voice AI agent apps

LiveKit Agents is an open source framework designed for building realtime AI agents that can participate as programmable entities within communication sessions. It enables developers to create conversational and multimodal agents capable of processing voice, audio, and other inputs in realtime environments. These agents can join LiveKit rooms as participants and interact with users or systems through speech, text, and other modalities. LiveKit Agents provides libraries and tooling that allow developers to combine speech-to-text, large language models, and text-to-speech services to build interactive AI experiences. ...

Downloads: 2 This Week

Last Update: 4 days ago
20

FireRed-Image-Edit

General-purpose image editing model that delivers high-fidelity

FireRed-Image-Edit is an open-source general-purpose image editing model and toolset designed to deliver high-fidelity, visually coherent edits across a wide range of editing tasks, from simple object modifications to complex enhancements like restoration and style preservation. It is built on a flexible text-to-image foundation model that has been extended with training paradigms including pretraining, supervised fine-tuning, and reinforcement learning to imbue the system with strong...

Downloads: 2 This Week

Last Update: 2026-04-03
21

Sora.FM

Sora AI Video Generator by Sora.FM

Sora.FM is positioned as a tool in the AI-generated video domain — likely aiming to let users produce video content via AI-driven workflows rather than classic manual editing. The project belongs to the growing class of “AI video generator / AI-assisted content creation” tools: it may use model-based generation, template-based editing, or combine video assets with generative models to automate parts of video creation or editing. For creators wanting to explore AI-based content generation —...

Downloads: 2 This Week

Last Update: 2025-12-08
22

Pipecat

Framework for building real-time voice and multimodal AI agents

Pipecat is an open source Python framework designed for building real-time voice and multimodal conversational AI agents. It provides developers with tools to orchestrate complex pipelines that combine speech recognition, language models, audio processing, and speech synthesis into a cohesive conversational system. Pipecat focuses on low-latency interactions so voice conversations with AI feel natural and responsive during live use. Pipecat allows applications to integrate multiple AI services and transports, enabling flexible deployment across different environments and communication channels. ...

Downloads: 0 This Week

Last Update: 2026-05-16
23

FlexLLMGen

Running large language models on a single GPU

FlexLLMGen is an open-source inference engine designed to run large language models efficiently on limited hardware resources such as a single GPU. The system focuses on high-throughput generation workloads where large batches of text must be processed quickly, such as large-scale data extraction or document analysis tasks. Instead of requiring expensive multi-GPU systems, the framework uses techniques such as memory offloading, compression, and optimized batching to run large models on...

Downloads: 0 This Week

Last Update: 2026-03-10
24

FastRTC

The Python library for real-time communication

...This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat, or collaborative audio/video tools. FastRTC also integrates nicely with UI frameworks (e.g. via a web demo using Gradio), so developers can rapidly prototype and deploy real-time streaming applications without deep knowledge of low-level WebRTC internals. Because voice-enabled AI agents often involve many moving parts (speech-to-text, text processing, text-to-speech, streaming, session/chat management), FastRTC helps by handling the streaming aspect, leaving the rest to be plugged in modularly.

Downloads: 1 This Week

Last Update: 2025-11-28
25

TADA

Open Source Speech Language Model

TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...

Downloads: 0 This Week

Last Update: 2026-03-24



© 2026 Slashdot Media. All Rights Reserved.
