Search Results for "text batch processing tools" - Page 18

Showing 447 open source projects for "text batch processing tools"

  • 1
    Guber, short for Gutenberg renamer, renames text files provided by the Gutenberg project into the format "Author, Title" by automatically extracting the relevant information from the text file. Guber can do single files or batch processing of a directo
    Downloads: 0 This Week
  • 2
    COOK is an embedded language which can be used as a macro preprocessor and for similar text processing. The concept is similar to PHP, but is oriented towards batch-mode processing.
    Downloads: 0 This Week
  • 3
    This project will compile a Hungarian wordlist for use with spell-checkers like aspell. Additionally, it will develop generic tools useful for compiling and maintaining wordlists for any language.
    Downloads: 2 This Week
  • 4
    Collection of tools to convert data from Windows Treepad files to Unix Yank format and vice versa. The following tools are available: hjt2yank, yank2hjt, and the hjtyank Windows frontend.
    Downloads: 0 This Week
  • 5
    Downloads: 0 This Week
  • 6
    Krypton Document Tools is a document manager that automatically encrypts all data.
    Downloads: 0 This Week
  • 7
    Text file to WAV/MP3 converter (reader). Uses Microsoft Speech API compatible engines (not included). Provides a command-line interface for batch processing. Supports a dictionary for speech correction.
    Downloads: 0 This Week
  • 8
    ZhDict provides command-line tools to aid English speakers in reading and understanding Chinese texts.
    Downloads: 0 This Week
  • 9
    PyKit includes some small applications written in Python, such as batch renaming and text processing.
    Downloads: 0 This Week
  • 10
    Codemod is a tool/library to assist you with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention.
    Downloads: 0 This Week
  • 11

    zavod

    Zavod is a compilation of universal automated data processing tools.

    The zavod project can't be described in one word. It is a compilation of universal automated data processing tools that can be used with almost every data source and can generate output in almost every possible format. It can be used as a Java library, as an interactive GUI application, as a console application, as a web application, or even to build websites with dynamic data from almost every data source.
    Downloads: 0 This Week
  • 12

    pdftools2

    Basic PDF tools.

    PdfTextExtractor: extract text from a PDF document with a text layer. PdfShapeDrawer: draw basic shapes on a PDF document. Programs using the iTextSharp library: http://itextpdf.com/
    Downloads: 0 This Week
  • 13
    An experimental set of tools for text analysis and dictionary construction. One goal is to improve text input, e.g. on devices with touchscreens, using dictionary-based symbolic on-screen keyboards.
    Downloads: 0 This Week
  • 14
    HickDocBook is a set of DocBook file converter tools that can output HTML (multi and single), CHM, and PDF files. It also supports DocBook files in Chinese character sets.
    Downloads: 0 This Week
  • 15

    open-tamil

    Tamil Tools, Tamil Library for Python 2, 3

    Open-Tamil is a full-featured Tamil text processing library in Python. It works fully in Python 2 and 3. Published on the Python Package Index and installable via pip. See: https://pypi.python.org/pypi/Open-Tamil/0.67
    Downloads: 0 This Week
  • 16
    AuthorWeb is an organization platform for writers of all kinds, which includes tools for creating and organizing Characters, Sets, Plots, Scenes, and other information for the craft. Written in Java, the executable jar can be run on any OS platform.
    Downloads: 0 This Week
  • 17
    This project will provide tools for users to convert existing web sites, blogs, and documents with non-standard Myanmar font data to Unicode 5.1 compatible data (Zawgyi to Unicode 5.1, WinMyanmar system to Unicode 5.1, etc.).
    Downloads: 0 This Week
  • 18

    OCR Reader

    OCR Reader is a lightweight Windows utility designed to extract text from PDF files and images using OCR (Tesseract engine). The tool supports template-based parsing, allowing structured output into CSV or TXT without manual coding. Core components: Tesseract OCR engine, Poppler (PDF rendering), and a template-based extraction system. Homepage: https://martan1484.github.io/OCR_Reader
    Downloads: 0 This Week
  • 19
    Maestro

    Offline AI orchestration with a modern UI & model integration

    LM-Kit Maestro is a powerful offline desktop application that lets you orchestrate AI agents directly on your local machine using a modern, clean interface. Built on the robust LM-Kit.NET framework with .NET MAUI and Razor, Maestro enables you to create personalized chatbots and conversational agents while ensuring your data remains secure with no external transfers. Evaluate each model’s performance based on your hardware and switch seamlessly between multiple models during a single...
    Downloads: 0 This Week
  • 20
    A toolset with various tools for advanced text processing, image editing, and converting multimedia formats.
    Downloads: 0 This Week
  • 21
    mms-300m-1130-forced-aligner

    CTC-based forced aligner for audio-text in 158 languages

    mms-300m-1130-forced-aligner is a multilingual forced alignment model based on Meta’s MMS-300M wav2vec2 checkpoint, adapted for Hugging Face’s Transformers library. It supports forced alignment between audio and corresponding text across 158 languages. The model enables word- or phoneme-level timestamping using Connectionist Temporal Classification (CTC) emissions. It is described as significantly more memory-efficient than the TorchAudio forced alignment API. Users can integrate it through the Python package ctc-forced-aligner, and it supports GPU acceleration via PyTorch. ...
    Downloads: 0 This Week
  • 22
    Bio_ClinicalBERT

    ClinicalBERT model trained on MIMIC notes for clinical NLP tasks

    Bio_ClinicalBERT is a domain-specific language model tailored for clinical natural language processing (NLP), extending BioBERT with additional training on clinical notes. It was initialized from BioBERT-Base v1.0 and further pre-trained on all clinical notes from the MIMIC-III database (~880M words), which includes ICU patient records. The training focused on improving performance in tasks like named entity recognition and natural language inference within the healthcare domain. Notes were...
    Downloads: 0 This Week