Search Results for "text batch processing tools" - Page 8

Showing 446 open source projects for "text batch processing tools"

  • 1
    EmotiVoice

    Multi-Voice and Prompt-Controlled TTS Engine

    EmotiVoice is a multi-voice, prompt-controlled text-to-speech engine designed to generate highly expressive speech across thousands of voices. It supports both English and Chinese and ships with over 2,000 preset voices, making it suitable for everything from characters and virtual anchors to narration and dialogue. The core idea is prompt-based emotional and style control: you can ask the engine to speak “happy,” “sad,” “excited,” or with other high-level style prompts that shape prosody,...
    Downloads: 4 This Week
  • 2
    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly, free, open source GUI tool for combining and splitting PDF files, converting PDF to Excel, PDF to Word, and images to PDF, zipping and unzipping, and annotating. It is easy to use, supports inserting, deleting, and processing multiple files, and lets you adjust the order of files before combining.
    Downloads: 0 This Week
  • 3
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
  • 4
    ddgr

    DuckDuckGo from the terminal

    ...It fetches search results via DuckDuckGo’s API or HTML output and presents links, snippets, and metadata in a clean terminal format, making it useful for programmers, sysadmins, and privacy advocates who prefer keyboard-driven workflows. The tool also supports options like opening a selected result in a web browser, piping results into other tools, and restricting searches to specific formats such as text-only or JSON for further processing. Because it avoids third-party tracking and ads built into many browser search experiences, ddgr appeals to users seeking greater control over data and a faster, distraction-free search flow.
    Downloads: 0 This Week
  • 5
    towhee

    Framework that is dedicated to making neural data processing

    ...Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 0 This Week
  • 6

    tabspace

    A Windows utility to batch-make program files tab-space compliant

    Different text viewers and editors interpret the Tab character differently, so a program that looks tidy in your editor can appear messy elsewhere. Many programmers therefore use spaces only to guarantee a consistent look, at the cost of wasted disk space. Is there an ideal solution? There is: enforce one rule on all program files. In each line, before the first non-space, non-tab character, use TAB only, and after that character, use SPACE...
    Downloads: 0 This Week
  • 7
    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    FunNLP is a large, curated collection of resources, corpora, and tools for Chinese natural language processing (NLP). It aggregates datasets, lexicons, wordlists, sentiment dictionaries, knowledge graphs, and pretrained model references, serving as a one-stop resource hub for Chinese NLP practitioners. The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion dictionaries, stopwords). ...
    Downloads: 0 This Week
  • 8
    UniEM

    Unified embedding model

    UniEM is a unified embedding model designed to create high-quality text embeddings for various natural language processing tasks.
    Downloads: 0 This Week
  • 9
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    Demucs (Deep Extractor for Music Sources) is a deep-learning framework for music source separation—extracting individual instrument or vocal tracks from a mixed audio file. The system is based on a U-Net-like convolutional architecture combined with recurrent and transformer elements to capture both short-term and long-term temporal structure. It processes raw waveforms directly rather than spectrograms, allowing for higher-quality reconstruction and fewer artifacts in separated tracks. The...
    Downloads: 63 This Week
  • 10
    TextSeek

    Professional full-text desktop search tool

    TextSeek is a professional full-text desktop search tool. Unlike filename search tools such as Everything and Listary, TextSeek searches both filenames and file contents quickly. It supports PDF, Word, Excel, PowerPoint, RTF, and other formats. The software runs directly, with no extra packages to install.
    Downloads: 6 This Week
  • 11
    Rome formatter

    Unified developer tools for JavaScript, TypeScript, and the web

    Rome is a formatter, linter, bundler, and more for JavaScript, TypeScript, JSON, HTML, Markdown, and CSS. It is designed to replace Babel, ESLint, webpack, Prettier, Jest, and others, unifying functionality that previously lived in separate tools. Building on a shared base allows it to provide a cohesive experience for processing code, displaying errors, parallelizing work, caching, and configuration. Rome has strong conventions and aims for minimal configuration. It is written in Rust and has first-class IDE support, with a sophisticated parser that represents the source text in full fidelity and top-notch error recovery. ...
    Downloads: 1 This Week
  • 12
    The Art of Command Line

    Master the command line, in one page

    The Art of Command Line is a single, highly curated page of tips that distills years of Unix command-line experience into practical, memorable guidance. It emphasizes fluency: small habits and commands that compound into faster debugging, data wrangling, and system navigation. The content spans basic shell usage, text processing with tools like grep/sed/awk, networking and performance inspection, and advice for working safely with root and destructive commands. Many entries highlight lesser-known flags or idioms that save keystrokes or avoid pitfalls, and the list aims to be dense but scannable. It is written for Linux first while acknowledging macOS and Windows differences where relevant. ...
    Downloads: 0 This Week
  • 13
    TXM

    Unicode XML TEI text analysis platform

    TXM is a free, open-source, cross-platform text analysis environment and graphical client based on Unicode and XML, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with built-in access control. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP...
    Downloads: 9 This Week
  • 14
    Botpress

    Dev tools to reliably understand text and automate conversations

    We make building chatbots much easier for developers. We have put together the boilerplate code and infrastructure you need to get a chatbot up and running. We offer a complete dev-friendly platform that ships with all the tools you need to build, deploy and manage production-grade chatbots in record time. It includes built-in natural language processing tasks such as intent recognition, spell checking, entity extraction, and slot tagging, among many others. A visual conversation studio to design...
    Downloads: 12 This Week
  • 15
    Promptify

    Use GPT or other prompt-based models to get structured output

    Promptify is an open-source Python library designed to simplify prompt engineering and the development of natural language processing pipelines using large language models. The project provides tools that help developers generate structured prompts for different NLP tasks and apply them across multiple generative AI systems. Instead of manually crafting prompts for each task, Promptify introduces a unified architecture that combines prompt templates, language model interfaces, and processing pipelines into a single framework. ...
    Downloads: 0 This Week
  • 16
    AudioEqualizer
    Introducing AudioEqualize: Elevate Your Audio Experience! AudioEqualize isn't just your average volume adjustment tool; it's a sophisticated audio wizard that goes beyond simple peak amplitude normalization. Designed to enhance your music library, AudioEqualize meticulously analyzes and precisely tunes your MP3 files to a target volume of your choice. Here's why it's the ultimate choice for audio enthusiasts:
    Downloads: 0 This Week
  • 17
    ChatGPT Advanced

    Browser extension adding web search results to ChatGPT prompts easily

    chatgpt-advanced, commonly known as WebChatGPT, is an open source browser extension designed to enhance the capabilities of ChatGPT by integrating real-time web search results into user prompts. It works by intercepting queries submitted to the ChatGPT interface and optionally augmenting them with information gathered from search engines before sending the prompt to the chatbot. This approach allows the model to generate responses that are more current and contextually relevant compared to...
    Downloads: 0 This Week
  • 18
    EBookGenTools

    EBook Generation Tools - scripts to create ebook formats EPUB, DOC

    EBookGenTools is a set of GNU/Linux shell scripts to process plain text for a book into HTML and electronic book formats. It was developed to create EPUB and DOC files from book text exported from novel writing software such as Manuskript, StoryBook, or your favourite text editor. EBookGenTools builds on the power of other software to create the following ebook formats: EPUB (using Calibre, an ebook management tool) and DOC (using LibreOffice, a free office suite). These tools can be used directly to...
    Downloads: 3 This Week
  • 19
    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based, digital publication and interchange of scholarly and educational editions of documentary and literary texts like inscriptions and papyri. The link below will take you to the EpiDoc home page on this site.
    Downloads: 5 This Week
  • 20
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...
    Downloads: 2 This Week
  • 21
    FFCreator

    A fast video processing library based on node.js

    ...Dynamically batch-generating short videos from pictures and text content is a technical challenge. FFCreator is a lightweight, flexible solution that needs few dependencies and modest hardware to get started quickly. It simulates 90% of the animation effects of animate.css, so you can easily convert web page animation effects into videos.
    Downloads: 0 This Week
  • 22
    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc, and DocBook. It has proofreading capabilities: on-the-fly spell checking, instant grammar checking, and built-in free dictionaries. It also offers syntax highlighting, bracket matching, folding, a document structure browser for sections and labels, bookmarks, a manager for LaTeX symbols, an editor for mathematical equations, an integrated BibTeX database manager, and several tools to convert LaTeX to HTML and back. AsciiDoc...
    Downloads: 1 This Week
  • 23
    Batch Insert License

    Prepend a copyright/license declaration to many files at once.

    Batch Insert License prepends a block of text describing the applicable software license to many source files automatically. It can add a specified license comment block to the top of each source file in your project, replace an existing license comment block with a specified new one in each source file, or delete an existing license comment block from each source file. It is written in Java and runs on any OS that has Java installed.
    Downloads: 0 This Week
  • 24
    Riffusion

    Real-time music generation using stable diffusion techniques AI

    ...It implements a diffusion pipeline that supports prompt interpolation, allowing smooth transitions between different musical styles or prompts over time. Riffusion (hobby) serves as the core implementation for audio and image processing, providing essential building blocks for generating music from text prompts. It includes both developer-oriented tools and user-facing components such as a command-line interface and an interactive Streamlit application for experimentation. Additionally, it can run as a Flask server to expose model inference through an API, enabling integration with other applications or services.
    Downloads: 4 This Week
  • 25
    Automatic YouTube subtitle generation

    Using OpenAI's Whisper to automatically generate YouTube subtitles

    ...It allows users to download videos or audio from YouTube and automatically generate subtitles or transcripts. The tool processes media locally, extracting audio and applying speech recognition to produce accurate text outputs. It supports multiple languages and can handle different Whisper model sizes, balancing performance and accuracy. yt-whisper is designed for automation, enabling batch processing of multiple videos for transcription workflows. It also provides options for exporting subtitles in common formats such as SRT. Overall, it simplifies the process of converting video content into searchable and accessible text.
    Downloads: 0 This Week