Search Results for "text processing" - Page 6

Showing 346 open source projects for "text processing"

  • 1

    Eng2BN CSV Translator

    Translate English to Bangla in CSV files, by row range.

    Eng2BN CSV Translator is a user-friendly Python tool that enables efficient translation of English text to Bangla within CSV files. The application supports large datasets and allows users to translate specific row ranges, making it ideal for batch processing.
    Downloads: 0 This Week
  • 2
    Free AI Watermark Remover - FreeRepair

    AI-powered tool to quickly remove watermarks from images flawlessly

    FreeRepair is a free, open-source, AI-powered tool for removing watermarks and for making blurry images clearer or larger. It is modeled on IOPaint, built on Python's Django framework, and requires no sign-up. FreeRepair makes it easy to remove watermarks, logos, text, or clutter from images. It needs no installation or internet connection, works out of the box, is safe and secure, and has no usage limits.
    Downloads: 38 This Week
  • 3
    CSM (Conversational Speech Model)

    A Conversational Speech Generation Model

    The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.
    Downloads: 7 This Week
  • 4
    AudioBC

    Offline desktop app to convert EPUB to MP3 using Kokoro-82M neural TTS

    ...Key Features: Neural Quality TTS: Uses the compact yet powerful Kokoro-82M model for high-fidelity, expressive voice synthesis. Privacy-First & Offline: After a one-time initial model download, all processing happens on your CPU. Your books never leave your computer. Multi-Language Support: Curated voices for English (US & UK), Italian, French, Spanish, and Portuguese (BR). Smart Extraction: Automatically filters out non-narrative cont
    Downloads: 0 This Week
  • 5
    BWR Ai watermark remover

    AI-powered tool to quickly remove watermarks from videos and photos

    Blue Wave Remover is an advanced AI-driven video watermark removal software designed to effortlessly eliminate logos, text, timestamps, and watermarks from video content. Utilizing cutting-edge computer vision and generative AI algorithms, it accurately detects and removes both static and moving watermarks while preserving the original video's quality, colors, and clarity. The program supports popular video formats and offers batch processing for fast and efficient removal on multiple files. ...
    Downloads: 21 This Week
  • 6
    LexiFinder

    AI-powered semantic indexing: automating the creation of book indexes

    ...Given one or more source documents and a set of keywords, it extracts all nouns, compares them semantically to the keywords using a pretrained NLP model, and produces a structured, hierarchical index ready to be included in a book or manuscript. LexiFinder works in two ways: as a command-line tool for scripting, automation, and batch processing, and as a graphical application for a guided, point-and-click experience. Both interfaces share the same underlying engine and support the same features. Supported input formats are PDF, DOCX, and ODT. The index can be exported as plain text, JSON, CSV, or HTML.
    Downloads: 1 This Week
  • 7
    Advanced Trigonometry Calculator

    Precision Trigonometry: Advanced Calculator for Complex Math

    Advanced Trigonometry Calculator is equipped with a user-friendly interface that allows for easy input of problems and instant computation. Professionals such as engineers who need to perform advanced trigonometric calculations in their work will find this tool extremely useful. ATC Online Alpha: https://advantrigoncalc.sourceforge.io/atc/ More info by clicking below: https://advantrigoncalc.sourceforge.io/ Advanced Trigonometry Calculator was only and always only developed by...
    Downloads: 18 This Week
  • 8
    File Sorter for Photographers

    Organize files/images from a csv or xlsx file.

    A user-friendly application to efficiently sort all types of files from a source folder into a destination folder based on a list of filenames provided in an Excel or CSV file.
    Downloads: 0 This Week
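
The list-driven sorting this entry describes can be sketched as follows. This is a minimal illustration, not the project's code: the function name `sort_files`, the assumption that filenames sit in the first CSV column with no header row, and the choice to move rather than copy files are all hypothetical.

```python
import csv
import shutil
from pathlib import Path


def sort_files(list_csv, source_dir, dest_dir):
    """Move every file named in the CSV from source_dir to dest_dir.

    Assumption: one filename per row, in the first column, no header.
    Returns (moved, missing) lists of filenames.
    """
    source, dest = Path(source_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved, missing = [], []
    with open(list_csv, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            # Skip empty rows and rows with a blank first cell.
            if not row or not row[0].strip():
                continue
            name = row[0].strip()
            candidate = source / name
            if candidate.is_file():
                shutil.move(str(candidate), str(dest / name))
                moved.append(name)
            else:
                # Listed in the CSV but not present in the source folder.
                missing.append(name)
    return moved, missing
```

Returning the missing names separately lets a caller report which listed files were not found instead of failing on the first one.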
  • 9
    Obsei

    Obsei is a low-code, AI-powered automation tool

    Obsei is an automated no-code/low-code AI-powered text observation and analysis framework, designed for extracting insights from unstructured text data such as social media, reviews, and logs.
    Downloads: 0 This Week
  • 10
    Wikipedia2Vec

    A tool for learning vector representations of words and entities

    Wikipedia2Vec is an embedding learning tool that creates word and entity vector representations from Wikipedia, enabling NLP models to leverage structured and contextual knowledge.
    Downloads: 0 This Week
  • 11
    Transformers4Rec

    Transformers4Rec is a flexible and efficient library

    Transformers4Rec is an advanced recommendation system library that leverages Transformer models for sequential and session-based recommendations. The library works as a bridge between natural language processing (NLP) and recommender systems (RecSys) by integrating with one of the most popular NLP frameworks, Hugging Face Transformers (HF). Transformers4Rec makes state-of-the-art transformer architectures available for RecSys researchers and industry practitioners. Traditional recommendation...
    Downloads: 0 This Week
  • 12
    YAYI

    Repo for YaYi Chinese LLMs based on LLaMA 2 & BLOOM

    YAYI is an open-source large language model project developed to provide a multilingual conversational AI system capable of performing a wide variety of natural language processing tasks. The model is trained on diverse datasets covering multiple languages and domains so that it can support applications ranging from dialogue systems to text analysis and knowledge retrieval. The architecture is based on transformer-style language models optimized for conversational understanding and generation. In addition to producing coherent responses, the system is designed to handle tasks such as summarization, translation, question answering, and text classification. ...
    Downloads: 0 This Week
  • 13
    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly, free, open-source GUI tool for combining and splitting PDF files, converting PDF to Excel or Word and images to PDF, zipping and unzipping, and annotating. It is easy to use, supports inserting, deleting, and processing multiple files, and lets you adjust the order of files before combining.
    Downloads: 2 This Week
  • 14
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
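
To make the MinHash and Jaccard thresholding idea in this entry concrete, here is a minimal standard-library sketch. It does not use text-dedup's own API; the function names, the 3-word shingles, and the 128 hash functions are illustrative assumptions.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def minhash_signature(items, num_perm=128):
    """Keep the minimum 64-bit hash of the set under each salted hash."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "little")  # a different hash function per seed
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching positions, which approximates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def is_near_duplicate(text_a, text_b, threshold=0.7, num_perm=128):
    """Flag a pair whose estimated Jaccard similarity meets the threshold."""
    sig_a = minhash_signature(shingles(text_a), num_perm)
    sig_b = minhash_signature(shingles(text_b), num_perm)
    return estimate_similarity(sig_a, sig_b) >= threshold
```

Comparing fixed-length signatures instead of full shingle sets is what keeps memory use low on large corpora; a production pipeline would typically add locality-sensitive hashing so that not every pair has to be compared.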
  • 15
    GPT-2 Output Dataset

    Dataset of GPT-2 outputs for research in detection, biases, and more

    The GPT-2 Output Dataset is a large collection of model-generated text, released by OpenAI alongside the GPT-2 research paper to study the behaviors and limitations of large language models. It contains 250,000 samples of GPT-2 outputs, generated with different sampling strategies such as top-k truncation, to highlight the diversity and quality of model completions. The dataset also includes corresponding human-written text for comparison, enabling researchers to explore methods for...
    Downloads: 0 This Week
  • 16
    ddgr

    DuckDuckGo from the terminal

    ...The tool also supports options like opening a selected result in a web browser, piping results into other tools, and restricting searches to specific formats such as text-only or JSON for further processing. Because it avoids third-party tracking and ads built into many browser search experiences, ddgr appeals to users seeking greater control over data and a faster, distraction-free search flow.
    Downloads: 0 This Week
  • 17
    towhee

    Framework that is dedicated to making neural data processing

    ...Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 0 This Week
  • 18
    MahaKurawa.My.ID URL Extractor

    MahaKurawa.My.ID URL Extractor is a simple tool to extract unique URLs

    MahaKurawa.My.ID URL Extractor is a simple tool that instantly extracts unique URLs from any text content. It's useful when you don't want to identify and copy-paste URLs from your content one by one.
    Downloads: 0 This Week
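
Extracting unique URLs from text, as this entry describes, can be sketched in a few lines. This is an illustrative approach, not the tool's implementation: the regular expression, the trailing-punctuation trimming, and the function name `extract_unique_urls` are assumptions.

```python
import re

# Assumption: only http(s) URLs are of interest, and a URL ends at
# whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def extract_unique_urls(text):
    """Return the URLs found in text, de-duplicated, in first-seen order."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(text):
        # Drop sentence punctuation that commonly trails a URL in prose.
        url = match.rstrip(".,;:!?)")
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

A set tracks what has already been seen while the list preserves the order in which URLs first appear.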
  • 19
    UniEM

    Unified embedding model

    UniEM is a unified embedding model designed to create high-quality text embeddings for various natural language processing tasks.
    Downloads: 0 This Week
  • 20
    Medusa

    Framework for Accelerating LLM Generation with Multiple Decoding Heads

    Medusa is a framework aimed at accelerating the generation capabilities of Large Language Models (LLMs) by employing multiple decoding heads. This approach allows for parallel processing during text generation, significantly enhancing throughput and reducing response times. Medusa is designed to be simple to implement and integrates with existing LLM infrastructures, making it a practical solution for scaling LLM applications.
    Downloads: 0 This Week
  • 21
    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    FunNLP is a large, curated collection of resources, corpora, and tools for Chinese natural language processing (NLP). It aggregates datasets, lexicons, wordlists, sentiment dictionaries, knowledge graphs, and pretrained model references, serving as a one-stop resource hub for Chinese NLP practitioners. The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion dictionaries, stopwords). ...
    Downloads: 0 This Week
  • 22
    Promptify

    Use GPT or other prompt-based models to get structured output

    Promptify is an open-source Python library designed to simplify prompt engineering and the development of natural language processing pipelines using large language models. The project provides tools that help developers generate structured prompts for different NLP tasks and apply them across multiple generative AI systems. Instead of manually crafting prompts for each task, Promptify introduces a unified architecture that combines prompt templates, language model interfaces, and processing pipelines into a single framework. ...
    Downloads: 0 This Week
  • 23
    Prime QA

    State-of-the-art Multilingual Question Answering research

    PrimeQA is a public open source repository that enables researchers and developers to train state-of-the-art models for question answering (QA). By using PrimeQA, a researcher can replicate the experiments outlined in a paper published in the latest NLP conference while also enjoying the capability to download pre-trained models (from an online repository) and run them on their own custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly...
    Downloads: 0 This Week
  • 24
    auto-subtitle

    Automatically generate and overlay subtitles for any video

    auto-subtitle is a Python-based command-line tool that automatically generates and overlays subtitles on video files using AI-driven speech recognition. It combines FFmpeg with OpenAI’s Whisper model to transcribe spoken audio into text and synchronize it with video playback. The tool processes video input, extracts audio, and produces subtitle files that can be either exported separately or burned directly into the final video output. It supports multiple transcription models with varying...
    Downloads: 0 This Week
  • 25
    textacy

    NLP, before and after spaCy

    textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after.
    Downloads: 0 This Week