Search Results for "text batch processing tools"

Showing 122 open source projects for "text batch processing tools"

View related business solutions
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    text-extract-api

    text-extract-api

    Document (PDF, Word, PPTX ...) extraction and parse API

    ...Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction capabilities into a unified API that standardizes the output. The platform supports automated processing pipelines that detect file types and apply the appropriate extraction method to obtain the most accurate text representation possible. It can be integrated into document analysis systems, knowledge retrieval tools, and AI pipelines that rely on clean textual data. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    abogen

    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    Chandra

    Chandra

    OCR model for complex documents with layout-aware structured outputs

    ...Chandra can be run locally using transformer-based inference or deployed with a high-performance server setup for large-scale processing. It also includes command-line tools and optional web-based interfaces to simplify interaction and batch processing workflows.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    LLM-Aided OCR Project

    LLM-Aided OCR Project

    Enhances Tesseract OCR output using LLMs (local or API)

    ...This AI-assisted correction process helps reconstruct missing characters, fix formatting mistakes, and produce more coherent text outputs. The project is particularly useful for digitizing historical documents, research papers, and scanned materials where traditional OCR often struggles. It also includes tools for processing batches of images or documents, enabling automated document digitization workflows.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 5
    OpenMed

    OpenMed

    Open source healthcare AI

    ...OpenMed can be used in three main ways: as a simple Python API for scripts and notebooks, as a Docker-friendly FastAPI service for backend integration, and as a batch-processing system for multi-document workflows.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 6
    deepdoctection

    deepdoctection

    A Repo For Document AI

    DeepDoctection is a document AI framework that applies deep learning techniques to analyze and extract structured data from scanned documents, PDFs, and images. deepdoctection is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models but enables you to build pipelines using highly acknowledged libraries for object detection, OCR and selected NLP tasks and provides an integrated frameworks for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Zerox OCR

    Zerox OCR

    PDF to Markdown with vision models

    A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, tables, charts, etc. The vision models just make sense. ZeroX is an open-source machine learning framework designed for fast experimentation and production deployment, optimized for speed and ease of use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Umi-OCR

    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines. The software supports flexible usage patterns including screenshot capture OCR, batch processing of large sets of images or documents, PDF parsing, QR code detection, and layout-aware paragraph output. ...
    Downloads: 74 This Week
    Last Update:
    See Project
  • 9
    AUTOMATIC1111 Stable Diffusion web UI
    ...The interface also supports prompt editing, batch processing, custom scripts, and many community extensions, making it a highly customizable and continually evolving platform for creative AI art generation.
    Downloads: 173 This Week
    Last Update:
    See Project
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 10
    Pathway

    Pathway

    Python ETL framework for stream processing, real-time analytics, LLM

    ...Pathway is especially well-suited for scenarios like financial analytics, IoT, fraud detection, and logistics, where high-velocity and continuously changing data is the norm. Unlike traditional batch processing frameworks, Pathway continuously updates the results of your data logic as new events arrive, functioning more like a database that reacts in real-time. It supports Python, integrates with modern data tools, and offers a deterministic dataflow model to ensure reproducibility and correctness.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    nunif

    nunif

    Misc; latest version of waifu2x; 2D video to stereo 3D video

    nunif is a deep learning–based image processing framework focused on image upscaling, restoration, denoising, and enhancement tasks using neural network models. The project provides a collection of AI-powered utilities designed primarily for anime-style artwork, illustrations, and high-quality image restoration workflows. It includes command-line tools and graphical interfaces for applying trained neural models to improve image resolution and visual clarity while minimizing artifacts. nunif supports GPU acceleration and batch processing, making it suitable for creators, archivists, and enthusiasts handling large image collections. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Faster Whisper

    Faster Whisper

    Faster Whisper transcription with CTranslate2

    Faster Whisper is an optimized implementation of the Whisper speech recognition model designed to deliver significantly faster inference while maintaining comparable accuracy. It leverages efficient inference engines and optimized computation strategies to reduce latency and resource consumption. The system is particularly useful for real-time or large-scale transcription tasks where performance is critical. It supports multiple model sizes, allowing users to balance speed and accuracy based...
    Downloads: 40 This Week
    Last Update:
    See Project
  • 13
    Voice-Pro

    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is the best gradio WebUI for transcription, translation and text-to-speech. It can be easily installed with one click. Create a virtual environment using Miniconda, running completely separate from the Windows system (fully portable). Supports real-time transcription and translation, as well as batch mode.
    Downloads: 49 This Week
    Last Update:
    See Project
  • 14
    SciSpaCy

    SciSpaCy

    A full spaCy pipeline and models for scientific/biomedical documents

    ScispaCy is a spaCy extension optimized for processing biomedical and scientific text, providing domain-specific NLP models for tasks like named entity recognition (NER) and dependency parsing.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    Hazm

    Hazm

    Persian NLP Toolkit

    Hazm is a natural language processing (NLP) library for Persian text, offering various tools for text preprocessing, tokenization, part-of-speech tagging, and more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    AutoSubSync

    AutoSubSync

    Automatic subtitle synchronization tool

    ...AutoSubSync also includes batch processing capabilities, enabling users to handle entire media libraries efficiently. It supports a wide range of subtitle formats and can synchronize subtitles using either the original video or a reference subtitle file. Overall, it streamlines subtitle correction workflows while maintaining flexibility and precision.
    Downloads: 29 This Week
    Last Update:
    See Project
  • 17
    Hugging Face - Speech To Speech

    Hugging Face - Speech To Speech

    Open speech-to-speech models and pipelines by Hugging Face toolkit AI

    This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Sygil WebUI

    Sygil WebUI

    Stable Diffusion web UI

    Sygil WebUI is a browser-based interface for running Stable Diffusion image generation locally or on a server, wrapping common text-to-image and image-to-image workflows into a practical UI. It provides multiple UI modes (including a legacy Gradio interface) and focuses on making iterative prompting, parameter tuning, and post-processing accessible without writing code. The UI exposes core generation controls like resolution, CFG guidance, sampling steps, samplers, seeds, and batch generation so users can reproduce results and refine outputs systematically. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    pyVideoTrans

    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    ...The tool supports both command-line and GUI modes, making it accessible to developers and creatives needing batch or automated processing.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 21
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 22
    AutoCut

    AutoCut

    Cut videos with a text editor

    ...AutoCut supports multiple transcription backends, including Whisper and faster-whisper modes, allowing users to choose based on speed or accuracy needs. After editing the transcript text, the corresponding video clips are merged into the final output, and the tool also produces matching subtitle files. Its command-line interface can be integrated into scripts, making it suitable for automated workflows or batch processing.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    edge-tts

    edge-tts

    Use Microsoft Edge's online text-to-speech service from Python

    edge-tts is a Python module and command-line tool that gives you direct access to Microsoft Edge’s online text-to-speech service without needing the Edge browser, Windows, or any API key. It wraps the same cloud voices used by Edge, exposing them through a simple CLI (edge-tts, edge-playback) and a Python API, so you can script high-quality speech generation in your own applications. The tool lets you list available voices, specify locale and voice name, and generate audio files in common...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 24
    Verticals v3

    Verticals v3

    Automated YouTube Shorts pipeline

    Verticals v3 is an automated content generation workflow designed to create and process YouTube Shorts videos programmatically. It combines multiple tools and scripts to handle tasks such as downloading source material, editing clips, adding subtitles, and formatting output for vertical video platforms. The pipeline emphasizes automation, allowing users to produce short-form content at scale with minimal manual intervention. It integrates FFmpeg and other media processing tools to handle...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Pocket TTS

    Pocket TTS

    A TTS that fits in your CPU (and pocket)

    ...Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.
    Downloads: 13 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB