Showing 1585 open source projects for "python text"

View related business solutions
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • 1
    RAG Anything

    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    RAG-Anything is an open-source unified framework that extends the Retrieval-Augmented Generation (RAG) paradigm to fully multimodal document and knowledge retrieval, enabling systems to ingest, parse, represent, and query rich content that includes text, images, tables, formulas, and other structured or visual elements. Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    SoniTranslate

    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets...
    Downloads: 56 This Week
    Last Update:
    See Project
  • 4
    Paperless-ngx

    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 23 This Week
    Last Update:
    See Project
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • 5
    MarkPDFDown

    MarkPDFDown

    A high-quality PDF to Markdown tool based on large language model

    MarkPDFdown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists. By producing Markdown rather than raw text,...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    IOPaint

    IOPaint

    Image inpainting tool powered by SOTA AI Model

    IOPaint is a powerful open-source image editing tool focused on inpainting, outpainting, object removal, and general image manipulation driven by state-of-the-art AI models, delivering these capabilities through both local and hosted workflows. Designed to be fully self-hosted and flexible, IOPaint supports a variety of underlying generators and inpaint models — from LaMa erase networks to Stable Diffusion-based replace/object generation — giving users multiple ways to refine or reconstruct...
    Downloads: 34 This Week
    Last Update:
    See Project
  • 7
    WhisperSpeech

    WhisperSpeech

    An Open Source text-to-speech system built by inverting Whisper

    WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS:...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    pdfly

    pdfly

    CLI tool to extract (meta)data from PDF and manipulate PDF files

    A Python library designed for manipulating PDF files with functionalities for extraction, transformation, and document generation.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    GitGutter

    GitGutter

    A Sublime Text 2/3 plugin to see git diff in gutter

    A Sublime Text plug-in to show information about files in a git repository. Gutter Icons indicating inserted, modified or deleted lines. Diff Popup with details about modified lines. Status Bar Text with information about file and repository and provides some commands like Goto Change to navigate between modified lines. Copy from Commit to copy the original content from the commit. Revert to Commit to revert a modified hunk to the original state in a commit. The diff popup shows the original...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 10
    Nugget

    Nugget

    Unlock the fullest potential of your device

    Nugget is a system customization toolkit designed to unlock hidden capabilities and advanced configuration options on supported devices, particularly within Apple ecosystems. It provides users with the ability to modify system-level settings that are typically restricted, such as status bar elements, lock screen behavior, and system daemons. The tool includes features for creating animated wallpapers, editing system UI elements, and enabling diagnostic or developer-oriented modes that are...
    Downloads: 90 This Week
    Last Update:
    See Project
  • 11
    GLM-4-Voice

    GLM-4-Voice

    GLM-4-Voice | End-to-End Chinese-English Conversational Model

    GLM-4-Voice is an open-source speech-enabled model from ZhipuAI, extending the GLM-4 family into the audio domain. It integrates advanced voice recognition and generation with the multimodal reasoning capabilities of GLM-4, enabling smooth natural interaction via spoken input and output. The model supports real-time speech-to-text transcription, spoken dialogue understanding, and text-to-speech synthesis, making it suitable for conversational AI, virtual assistants, and accessibility...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 12
    LangCheck

    LangCheck

    Simple, Pythonic building blocks to evaluate LLM applications

    Simple, Pythonic building blocks to evaluate LLM applications.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    LLaMA-Mesh

    LLaMA-Mesh

    Unifying 3D Mesh Generation with Language Models

    LLaMA-Mesh is a research framework that extends large language models so they can understand and generate 3D mesh data alongside text. The system introduces a method for representing 3D meshes in a textual format by encoding vertex coordinates and face definitions as sequences that can be processed by a language model. By serializing 3D geometry into text tokens, the approach allows existing transformer architectures to generate and interpret 3D models without requiring specialized visual...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    py-pdf-parser

    py-pdf-parser

    A Python tool to help extracting information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents. ​
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    Qwen-Image

    Qwen-Image

    Qwen-Image is a powerful image generation foundation model

    Qwen-Image is a powerful 20-billion parameter foundation model designed for advanced image generation and precise editing, with a particular strength in complex text rendering across diverse languages, especially Chinese. Built on the MMDiT architecture, it achieves remarkable fidelity in integrating text seamlessly into images while preserving typographic details and layout coherence. The model excels not only in text rendering but also in a wide range of artistic styles, including...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    vim-ai

    vim-ai

    AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim

    vim-ai is an AI-powered assistant plugin for Vim and Neovim that brings language-model features directly into the editor. It allows users to generate code or text, edit selections in place, and carry on interactive chat-style conversations without leaving the terminal editing environment. The plugin is built around OpenAI-compatible APIs, which means it can work not only with OpenAI itself but also with compatible proxies and alternative providers. Its command set covers text completion,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    OpenMed

    OpenMed

    Open source healthcare AI

    ...OpenMed can be used in three main ways: as a simple Python API for scripts and notebooks, as a Docker-friendly FastAPI service for backend integration, and as a batch-processing system for multi-document workflows.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    pyVideoTrans

    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 19
    SDL_GameControllerDB

    SDL_GameControllerDB

    A community sourced database of game controller mappings

    SDL_GameControllerDB is a community-maintained database of game controller mappings designed to be used with the SDL (Simple DirectMedia Layer) library’s Game Controller API for both SDL2 and SDL3. Because many controllers report different axes and button layouts depending on platform and manufacturer, this project provides a large text database (gamecontrollerdb.txt) that maps those raw inputs to a standardized layout for consistent use in games and applications across Windows, macOS,...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 20
    LangExtract

    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 21
    Mirascope

    Mirascope

    LLM abstractions that aren't obstructions

    Mirascope is a powerful, flexible, and user-friendly library that simplifies the process of working with LLMs through a unified interface that works across various supported providers, including OpenAI, Anthropic, Mistral, Gemini, Groq, Cohere, LiteLLM, Azure AI, Vertex AI, and Bedrock. Whether you're generating text, extracting structured information, or developing complex AI-driven agent systems, Mirascope provides the tools you need to streamline your development process and create...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 22
    Texify

    Texify

    Math OCR model that outputs LaTeX and markdown

    Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    kg-gen

    kg-gen

    Knowledge Graph Generation from Any Text

    kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    HunyuanWorld 1.0

    HunyuanWorld 1.0

    Generating Immersive, Explorable, and Interactive 3D Worlds

    HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations. This approach enables 360° immersive experiences, seamless mesh export for graphics pipelines, and...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 25
    Pixoo

    Pixoo

    A library to help you make the most out of your Pixoo 64

    Pixoo is a Python-based library for controlling Divoom Pixoo LED displays using Bluetooth Low Energy (BLE). It allows users to send images, animations, or text to Pixoo devices, enabling creative integrations like desktop widgets, real-time data displays, or custom artwork.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB