Page 2 | text batch processing tools free download

LLM TLDR

95% token savings. 155x faster queries. 16 languages

...To enhance usability, LLM-TLDR includes command-line tools and integration examples for common workflows like batch summarization, webhook ingestion, and automation in documentation pipelines.

Downloads: 0 This Week

Last Update: 2026-01-27

See Project

ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while...

Downloads: 0 This Week

Last Update: 2026-04-22

See Project

AudioCraft

Audiocraft is a library for audio processing and generation

...It also contains training code and recipes, so researchers can fine-tune on custom data or explore new objectives without building infrastructure from scratch. Example notebooks, CLI tools, and audio utilities help with prompt design, conditioning on reference audio, and post-processing to produce ready-to-share outputs.

Downloads: 11 This Week

Last Update: 2025-10-13

See Project

NLP

Open source NLP guide with models, methods, and real use cases

NLP is an open source introductory resource for natural language processing, presented as a continuously updated book hosted on GitHub. It explains how machines process and understand human language, combining theory with practical examples. Its covers core NLP concepts such as text representation, feature extraction, and model evaluation, alongside hands-on implementations using tools like Word2Vec, TF-IDF, and FastText.

Downloads: 2 This Week

Last Update: 4 days ago

See Project

Stanza

Stanford NLP Python library for many human languages

Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of...

Downloads: 0 This Week

Last Update: 2026-05-14

See Project

FireRed-Image-Edit

General-purpose image editing model that delivers high-fidelity

FireRed-Image-Edit is an open-source general-purpose image editing model and toolset designed to deliver high-fidelity, visually coherent edits across a wide range of editing tasks, from simple object modifications to complex enhancements like restoration and style preservation. It is built on a flexible text-to-image foundation model that has been extended with training paradigms including pretraining, supervised fine-tuning, and reinforcement learning to imbue the system with strong...

Downloads: 3 This Week

Last Update: 2026-04-03

See Project

Sparrow

Structured data extraction and instruction calling with ML, LLM

Sparrow is an open-source platform designed to extract structured information from documents, images, and other unstructured data sources using machine learning and large language models. The system focuses on transforming complex documents such as invoices, receipts, forms, and scanned pages into structured formats like JSON that can be processed by downstream applications. It combines several components, including OCR pipelines, vision-language models, and LLM-based reasoning modules to...

Downloads: 0 This Week

Last Update: 2026-03-04

See Project

NarratoAI

Using AI models to automatically provide commentary and edit videos

NarratoAI is an open-source platform designed to automate the generation of narrative content using artificial intelligence. The system combines large language models with media processing capabilities to create scripts, stories, and structured narrative outputs from user inputs. NarratoAI supports workflows where users provide prompts, themes, or source materials, and the software organizes them into coherent narrative structures suitable for articles, scripts, or multimedia storytelling. The project integrates multiple AI components such as text generation models, content structuring pipelines, and automated editing tools to streamline content creation. ...

Downloads: 2 This Week

Last Update: 2026-04-27

See Project

LiveKit Agents

Framework for building realtime multimodal voice AI agents apps

LiveKit Agents is an open source framework designed for building realtime AI agents that can participate as programmable entities within communication sessions. It enables developers to create conversational and multimodal agents capable of processing voice, audio, and other inputs in realtime environments. These agents can join LiveKit rooms as participants and interact with users or systems through speech, text, and other modalities. LiveKit Agents provides libraries and tooling that allow developers to combine speech-to-text, large language models, and text-to-speech services to build interactive AI experiences. ...

Downloads: 1 This Week

Last Update: 5 days ago

See Project

FlexLLMGen

Running large language models on a single GPU

FlexLLMGen is an open-source inference engine designed to run large language models efficiently on limited hardware resources such as a single GPU. The system focuses on high-throughput generation workloads where large batches of text must be processed quickly, such as large-scale data extraction or document analysis tasks. Instead of requiring expensive multi-GPU systems, the framework uses techniques such as memory offloading, compression, and optimized batching to run large models on...

Downloads: 0 This Week

Last Update: 2026-03-10

See Project

Pipecat

Framework for building real-time voice and multimodal AI agents

Pipecat is an open source Python framework designed for building real-time voice and multimodal conversational AI agents. It provides developers with tools to orchestrate complex pipelines that combine speech recognition, language models, audio processing, and speech synthesis into a cohesive conversational system. Pipecat focuses on low-latency interactions so voice conversations with AI feel natural and responsive during live use. Pipecat allows applications to integrate multiple AI services and transports, enabling flexible deployment across different environments and communication channels. ...

Downloads: 0 This Week

Last Update: 2026-05-16

See Project

TADA

Open Source Speech Language Model

TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...

Downloads: 0 This Week

Last Update: 2026-03-24

See Project

Advanced AI explainability for PyTorch

Advanced AI Explainability for computer vision

pytorch-grad-cam is an open-source library that provides advanced explainable AI techniques for interpreting the predictions of deep learning models used in computer vision. The project implements Grad-CAM and several related visualization methods that highlight the regions of an image that most strongly influence a neural network’s decision. These visualization techniques allow developers and researchers to better understand how convolutional neural networks and transformer-based vision...

Downloads: 0 This Week

Last Update: 5 days ago

See Project

IMS Toucan

Controllable and fast Text-to-Speech for over 7000 languages

IMS-Toucan is a toolkit for training, using, and teaching state-of-the-art text-to-speech systems, built at the Institute for Natural Language Processing (IMS), University of Stuttgart. It is the official home of ToucanTTS, a massively multilingual TTS system designed to support over 7,000 languages with a single unified framework. The toolkit focuses on being fast and controllable while not requiring huge amounts of compute, making it practical for research labs and smaller teams. ...

Downloads: 0 This Week

Last Update: 2025-11-28

See Project

Trae Agent

LLM-based agent for general purpose software engineering tasks

Trae Agent is an open-source, LLM-based agent system also developed by ByteDance, focused primarily on automating software engineering workflows. It provides a command-line interface (CLI) that accepts natural-language instructions (e.g. “refactor this module,” “write a unit test,” “generate a REST API skeleton”), and then orchestrates tool-based workflows — such as file editing, shell/batch commands, code generation, code formatting or refactoring — to carry out complex engineering tasks....

Downloads: 1 This Week

Last Update: 2026-02-05

See Project

HivisionIDPhoto

HivisionIDPhotos: a lightweight and efficient AI ID photos tools

HivisionIDPhotos is an open-source AI project designed to automatically generate professional ID photographs from ordinary portrait images. The system uses computer vision and machine learning models to detect faces, segment the subject from the background, and produce standardized identification photos suitable for official documents. It is designed as a lightweight tool that can perform inference offline and run efficiently on CPUs without requiring powerful GPUs. The software analyzes...

Downloads: 1 This Week

Last Update: 2026-03-10

See Project

video-use

Edit videos with Claude Code

Video Use is an open-source AI-powered video editing tool that allows users to transform raw footage into polished videos using natural language commands. Designed to work with Claude Code, it automates the entire editing process—from cutting clips to rendering the final output—without requiring manual timelines or complex software interfaces. The system intelligently analyzes audio transcripts and visual cues to make precise, context-aware editing decisions. It supports a wide range of...

Downloads: 24 This Week

Last Update: 2026-05-15

See Project

HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

HunyuanVideo is a cutting-edge framework designed for large-scale video generation, leveraging advanced AI techniques to synthesize videos from various inputs. It is implemented in PyTorch, providing pre-trained model weights and inference code for efficient deployment. The framework aims to push the boundaries of video generation quality, incorporating multiple innovative approaches to improve the realism and coherence of the generated content. Release of FP8 model weights to reduce GPU...

1 Review

Downloads: 4 This Week

Last Update: 2025-09-23

See Project

HunyuanOCR

OCR expert VLM powered by Hunyuan's native multimodal architecture

HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...

Downloads: 1 This Week

Last Update: 2026-05-11

See Project

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one...

Downloads: 0 This Week

Last Update: 2026-01-27

See Project

Streamer-Sales

LLM Large Model of Selling Anchor

Streamer-Sales is an open-source large language model system designed specifically for e-commerce live streaming and automated product promotion. The project focuses on generating persuasive product descriptions and live presentation scripts that mimic the style of professional online sales hosts. By analyzing product characteristics and marketing information, the model can produce engaging explanations that emphasize benefits, features, and emotional appeal to encourage viewers to make...

Downloads: 0 This Week

Last Update: 2026-03-05

See Project

Open-LLM-VTuber

Open source AI VTuber platform with voice chat and Live2D avatars

Open-LLM-VTuber is an open source platform designed to create AI-powered VTuber characters that can interact with users through voice and animated avatars. It enables hands-free conversations with large language models by combining speech recognition, language processing, and text-to-speech synthesis into a single system. Users can speak directly to the AI character, and the system can respond with a generated voice while animating a Live2D avatar to simulate a talking virtual personality. Open-LLM-VTuber is modular, allowing developers to swap or configure different language models, speech recognition engines, and voice synthesis systems depending on their needs. ...

Downloads: 2 This Week

Last Update: 2026-03-17

See Project

Conversational Health Agents (CHA)

A Personalized LLM-powered Agent Frameworks

CHA, or Conversational Health Agents, is an open-source framework designed to build intelligent healthcare assistants powered by large language models and external data sources. The system enables developers to create personalized AI agents that can interact with users through natural language while performing multi-step reasoning and task execution. It integrates orchestration capabilities that allow the agent to gather information from APIs, knowledge bases, and external services in order...

Downloads: 0 This Week

Last Update: 2026-03-17

See Project

DreamCraft3D

Official implementation of DreamCraft3D

DreamCraft3D is DeepSeek’s generative 3D modeling framework / model family that likely extends their earlier 3D efforts (e.g. Shap-E or Point-E style models) with more capability, control, or expression. The name suggests a “dream crafting” metaphor—users probably supply textual or image prompts and generate 3D assets (point clouds, meshes, scenes). The repository includes model code, inference scripts, sample prompts, and possibly dataset preparation pipelines. It may integrate rendering or...

Downloads: 0 This Week

Last Update: 2025-10-03

See Project

NVIDIA Generative AI Examples

Generative AI reference workflows

NVIDIA GenerativeAIExamples is an open-source repository that provides practical reference implementations and example workflows for building generative AI applications using NVIDIA’s software ecosystem. The project is designed to help developers accelerate the development of AI applications by providing ready-to-run pipelines, notebooks, and tools that demonstrate how to integrate large language models into real-world systems. The repository includes examples covering topics such as retrieval-augmented generation pipelines, agent-based workflows, and multimodal AI applications that combine text, vision, and data processing. Many of the examples show how to deploy AI services using containerized environments, GPU acceleration, and microservices that can scale across modern infrastructure. ...

Downloads: 0 This Week

Last Update: 2026-03-05

See Project

Search Results for "text batch processing tools" - Page 2

Showing 78 open source projects for "text batch processing tools"

LLM TLDR

ESPnet

AudioCraft

NLP

Stanza

FireRed-Image-Edit

Sparrow

NarratoAI

LiveKit Agents

FlexLLMGen

Pipecat

TADA

Advanced AI explainability for PyTorch

IMS Toucan

Trae Agent

HivisionIDPhoto

video-use

HunyuanVideo

HunyuanOCR

Kimi-Audio

Streamer-Sales

Open-LLM-VTuber

Conversational Health Agents (CHA)

DreamCraft3D

NVIDIA Generative AI Examples

Search Results for "text batch processing tools" - Page 2

Showing 78 open source projects for "text batch processing tools"

LLM TLDR

ESPnet

AudioCraft

NLP

Stanza

FireRed-Image-Edit

Sparrow

NarratoAI

LiveKit Agents

FlexLLMGen

Pipecat

TADA

Advanced AI explainability for PyTorch

IMS Toucan

Trae Agent

HivisionIDPhoto

video-use

HunyuanVideo

HunyuanOCR

Kimi-Audio

Streamer-Sales

Open-LLM-VTuber

Conversational Health Agents (CHA)

DreamCraft3D

NVIDIA Generative AI Examples

Related Searches

Related Categories