Showing 1283 open source projects for "python text"

View related business solutions
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Access competitive interest rates on your digital assets.

    Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    TextWorld

    TextWorld

    ​TextWorld is a sandbox learning environment for the training

    TextWorld is a learning environment designed to train reinforcement learning agents to play text-based games, where actions and observations are entirely in natural language. Developed by Microsoft Research, TextWorld focuses on language understanding, planning, and interaction in complex, narrative-driven environments. It generates games procedurally, enabling scalable testing of agents’ natural language processing and decision-making abilities.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    FastKoko

    FastKoko

    Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

    FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    kapture

    kapture

    Tools for manipulating datasets

    Kapture is a pivot file format, based on text and binary files, used to describe SfM (Structure From Motion) and more generally sensor-acquired data.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Desloppify

    Desloppify

    Agent harness to make your slop code well-engineered and beautiful

    Desloppify is a utility-focused project aimed at improving the quality, structure, and clarity of generated or written text by removing redundancy, noise, and unnecessary verbosity. It is designed to “clean up” outputs, particularly those produced by AI systems, making them more concise, readable, and professional. The system likely applies heuristics or transformation rules to identify repetitive patterns, filler content, and stylistic inconsistencies. This makes it especially useful in...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • 5
    Hunyuan3D-1

    Hunyuan3D-1

    A Unified Framework for Text-to-3D and Image-to-3D Generation

    Hunyuan3D-1 is an earlier version in the same 3D generation line (the unified framework for text-to-3D and image-to-3D tasks) by Tencent Hunyuan. It provides a framework combining shape generation and texture synthesis, enabling users to create 3D assets from images or text conditions. While less advanced than version 2.1, it laid the foundations for the later PBR, higher resolution, and open-source enhancements. (Note: less detailed public documentation was found for Hunyuan3D-1 compared to...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    LTX-2.3

    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...
    Downloads: 137 This Week
    Last Update:
    See Project
  • 7
    OpenMed

    OpenMed

    Open source healthcare AI

    ...OpenMed can be used in three main ways: as a simple Python API for scripts and notebooks, as a Docker-friendly FastAPI service for backend integration, and as a batch-processing system for multi-document workflows.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    SciSpaCy

    SciSpaCy

    A full spaCy pipeline and models for scientific/biomedical documents

    ScispaCy is a spaCy extension optimized for processing biomedical and scientific text, providing domain-specific NLP models for tasks like named entity recognition (NER) and dependency parsing.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    deepdoctection

    deepdoctection

    A Repo For Document AI

    ...For more specific text processing tasks use one of the many other great NLP libraries.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 10
    Paperless-ngx

    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 30 This Week
    Last Update:
    See Project
  • 11
    GLM-TTS

    GLM-TTS

    Controllable & emotion-expressive zero-shot TTS

    GLM-TTS is an advanced text-to-speech synthesis system built on large language model technologies that focuses on producing high-quality, expressive, and controllable spoken output, including features like emotion modulation and zero-shot voice cloning. It uses a two-stage architecture where a generative LLM first converts text into intermediate speech token sequences and then a Flow-based neural model converts those tokens into natural audio waveforms, enabling rich prosody and voice...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    Matcha-TTS

    Matcha-TTS

    A fast TTS architecture with conditional flow matching

    Matcha-TTS is a non-autoregressive neural text-to-speech architecture that uses conditional flow matching to generate speech quickly while maintaining natural quality. It models speech as an ODE-based generative process, and conditional flow matching lets it reach high-quality audio in only a few synthesis steps, which greatly reduces latency compared to score-matching diffusion approaches. The model is fully probabilistic, so it can generate diverse realizations of the same text while still...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    GLM-OCR

    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B),...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 14
    Speakr

    Speakr

    Speakr is a personal, self-hosted web application

    Speakr is an open-source, real-time text-to-speech (TTS) web application that allows users to convert written text into natural-sounding speech in just a few clicks. It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications. Behind the scenes, Speakr leverages modern TTS engines and streaming audio technologies to deliver smooth and...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    Underthesea

    Underthesea

    Underthesea - Vietnamese NLP Toolkit

    Underthesea is a Vietnamese NLP toolkit providing various text processing capabilities, including word segmentation, part-of-speech tagging, and named entity recognition.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Gitingest

    Gitingest

    Create prompt-friendly codebase digests from any Git repository URL

    ...In addition to producing the code digest, Gitingest also calculates statistics about the extracted content such as repository structure, total size of the extract, and token count. Gitingest can be used as a command line utility or integrated directly into Python applications.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 17
    SafeClaw

    SafeClaw

    Chat with it via text and voice

    SafeClaw is an open-source, entirely local alternative to cloud-based AI assistants like OpenClaw, enabling users to build a personal assistant that runs on their own machine without incurring API usage charges or exposing data to third-party services. It emphasizes privacy and predictability by using traditional programming, rule-based intent parsing, and established machine learning tools rather than large language models, meaning there are no per-token API costs and deterministic...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 18
    BlogWizard

    BlogWizard

    Generate blog articles from video or audio

    BlogWizard is a demo/utility project built on top of Groq’s LLM infrastructure that converts video or audio content into well-structured blog posts, enabling creators to repurpose multimedia content into text — useful for SEO, accessibility, or reaching audiences that prefer reading. The tool uses transcription (e.g. via Whisper) to extract text from audio/video, then runs an LLM-based generation pipeline to transform that content into coherent, readable blog-format posts — with sections,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Lingua-Py

    Lingua-Py

    The most accurate natural language detection library for Python

    Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    pyTermTk

    pyTermTk

    Python Terminal Toolkit - a Spiced Up TUI Library

    pyTermTk is a Text-based user interface library (TUI). Evolved from the discontinued project pyCuT and inspired by a mix of Qt5, GTK, and tkinter API definition with a touch of personal interpretation.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 21
    MedGemma

    MedGemma

    Collection of Gemma 3 variants that are trained for performance

    MedGemma is a collection of specialized open-source AI models created by Google as part of its Health AI Developer Foundations initiative, built on the Gemma 3 family of transformer models and trained for medical text and image comprehension tasks that help accelerate the development of healthcare-focused AI applications. It includes multiple variants such as a 4 billion-parameter multimodal model that can process both medical images and text and a 27 billion-parameter text-only (and...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    VideoCaptioner

    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    VideoCaptioner is an open source AI-powered subtitle processing tool designed to simplify the workflow of creating subtitles for videos. It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models...
    Downloads: 11 This Week
    Last Update:
    See Project
  • 23
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    GLM-4-Voice

    GLM-4-Voice

    GLM-4-Voice | End-to-End Chinese-English Conversational Model

    GLM-4-Voice is an open-source speech-enabled model from ZhipuAI, extending the GLM-4 family into the audio domain. It integrates advanced voice recognition and generation with the multimodal reasoning capabilities of GLM-4, enabling smooth natural interaction via spoken input and output. The model supports real-time speech-to-text transcription, spoken dialogue understanding, and text-to-speech synthesis, making it suitable for conversational AI, virtual assistants, and accessibility...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 25
    Flet

    Flet

    Flet enables developers to easily build realtime web and mobile apps

    ...With Flet you just write a monolith stateful app in Python only and get a multi-user, real-time Single-Page Application (SPA). To start developing with Flet, you just need your favorite IDE or text editor. With no SDKs, no thousands of dependencies, no complex tooling, Flet has a built-in web server with assets hosting and desktop clients.
    Downloads: 32 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB