Search Results for "open source speech to text software" - Page 6

Showing 542 open source projects for "open source speech to text software"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 1
    fairseq2

    fairseq2

    FAIR Sequence Modeling Toolkit 2

    fairseq2 is a modern, modular sequence modeling framework developed by Meta AI Research as a complete redesign of the original fairseq library. Built from the ground up for scalability, composability, and research flexibility, fairseq2 supports a broad range of language, speech, and multimodal content generation tasks, including instruction fine-tuning, reinforcement learning from human feedback (RLHF), and large-scale multilingual modeling. Unlike the original fairseq—which evolved into a...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    Umi-OCR

    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines.
    Downloads: 45 This Week
    Last Update:
    See Project
  • 3
    FastAPI

    FastAPI

    FastAPI framework, high performance, easy to learn, fast to code

    FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.6+ based on standard Python type hints. Great editor support. Completion everywhere. Less time debugging. Designed to be easy to use and learn. Less time reading docs. Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs. Get production-ready code. With automatic interactive documentation. Based on (and fully compatible with) the open standards for APIs: OpenAPI...
    Downloads: 35 This Week
    Last Update:
    See Project
  • 4
    ART ASCII Library

    ART ASCII Library

    ASCII art library for Python

    ASCII art is also known as "computer text art". It involves the smart placement of typed special characters or letters to make a visual shape that is spread over multiple lines of text. ART is a Python lib for text converting to ASCII art fancy.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    PyGPT

    PyGPT

    Open source personal AI Assistant for Linux, Windows and Mac

    PyGPT is a desktop application that allows you to talk to OpenAI's LLM models such as GPT4 and GPT3 using your own computer and OpenAI API. It allows you to talk in chat mode and in completion mode, as well as generate images using DALL-E 2. PyGPT also adds access to the Internet for GPT via Google Custom Search API and Wikipedia API and includes voice synthesis using Microsoft Azure Text-to-Speech API. Moreover, the application has implemented context memory support, context storage,...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    Imagen - Pytorch

    Imagen - Pytorch

    Implementation of Imagen, Google's Text-to-Image Neural Network

    Implementation of Imagen, Google's Text-to-Image Neural Network that beats DALL-E2, in Pytorch. It is the new SOTA for text-to-image synthesis. Architecturally, it is actually much simpler than DALL-E2. It consists of a cascading DDPM conditioned on text embeddings from a large pre-trained T5 model (attention network). It also contains dynamic clipping for improved classifier-free guidance, noise level conditioning, and a memory-efficient unit design. It appears neither CLIP nor prior...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    ...Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and open-source software which means that transparency is the core value of our software development. Source code can be reviewed and improved by anyone from anywhere. Papermerge supports multiple users. Each user can be assigned different permissions to perform only a specific kind of action e.g. view only documents from a specific folder. ...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 8
    pywinauto

    pywinauto

    Windows GUI Automation with Python (based on text properties)

    pywinauto is a set of Python modules to automate the Microsoft Windows GUI. At its simplest it allows you to send mouse and keyboard actions to Windows dialogs and controls, but it has support for more complex actions like getting text data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    DocTR

    DocTR

    Library for OCR-related tasks powered by Deep Learning

    DocTR provides an easy and powerful way to extract valuable information from your documents. Seemlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor. State-of-the-art performances on public document...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 10
    Luna AI

    Luna AI

    Virtual AI anchor that combines state-of-the-art technology

    Luna AI is a virtual AI streamer framework designed to power an interactive VTuber that can go live on major platforms and chat with viewers in real time. It is built around a core assistant persona called “Luna AI,” which can be driven by a wide range of large language models and platforms, including GPT-style APIs, Claude, LangChain-based backends, ChatGLM, Kimi, Ollama, and many others. The project supports multiple rendering backends for the avatar, such as Live2D, Unreal Engine (UE),...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    GitSavvy

    GitSavvy

    Full git and GitHub integration with Sublime Text

    Sublime Text plugin providing probably all git has to offer. Sublime Text 2 is not supported. Also, GitSavvy takes advantage of modern features of Sublime Text (like annotations). For the best experience, use the latest Sublime Text dev build. The documentation is probably outdated. Yeah it's sad but you can contribute and I will eventually get onto it but every special view has help available, just press ?. GitSavvy requires Git versions at or greater than 2.18.0. basic Git functionality;...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    rich

    rich

    Rich is a Python library for rich text and beautiful formatting

    The Rich API makes it easy to add color and style to terminal output. Rich can also render pretty tables, progress bars, markdown, syntax highlighted source code, tracebacks, and more, out of the box. Rich is a Python library for rich text and beautiful formatting in the terminal. Rich works with Linux, OSX, and Windows. True color/emoji works with new Windows Terminal, classic terminal is limited to 16 colors. Rich requires Python 3.7 or later. Effortlessly add rich output to your...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    Nano PDF Editor

    Nano PDF Editor

    Edit PDF files with Nano Banana

    Nano PDF Editor is a minimalist, portable PDF viewer and toolkit that focuses on simplicity, speed, and ease of integration for applications that need basic PDF rendering without heavy dependencies. It provides core functionality such as page navigation, zooming, text selection, and rendering directly to native graphics surfaces, making it suitable for lightweight PDF viewing scenarios on desktop or embedded platforms. Designed to be easily embedded into larger software projects, Nano-PDF...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 14
    spaCy

    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it was designed to be used for real world applications-- for building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 15
    RAG Anything

    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    RAG-Anything is an open-source unified framework that extends the Retrieval-Augmented Generation (RAG) paradigm to fully multimodal document and knowledge retrieval, enabling systems to ingest, parse, represent, and query rich content that includes text, images, tables, formulas, and other structured or visual elements. Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and...
    Downloads: 300 This Week
    Last Update:
    See Project
  • 17
    Mice MX OS speech to text Voice Control

    Mice MX OS speech to text Voice Control

    Mice speech to text with MX Cinnamon OS ISO

    Note about this image This image contains a system based on Linux MX, which was created to improve accessibility within the Linux environment. The distribution uses the Cinnamon desktop interface, which is configured to be operated using voice commands and outputs. The user interface and the control of your own devices and home automation systems can be customized and extended. The voice control program MiceStTM.py was developed to enable easy adaptation to other languages. However, only...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Rasa

    Rasa

    Open source machine learning framework to automate text conversations

    Rasa is an open source machine learning framework to automate text-and voice-based conversations. With Rasa, you can build contextual assistants on Facebook Messenger, Slack, Google Hangouts, Webex Teams, Microsoft Bot Framework, Rocket.Chat, Mattermost, Telegram, and Twilio or on your own custom conversational channels. Rasa helps you build contextual assistants capable of having layered conversations with lots of back-and-forths. In order for a human to have a meaningful exchange with a...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 19
    spaCy models

    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    CadQuery

    CadQuery

    A python parametric CAD scripting framework based on OCCT

    CadQuery is an intuitive, easy-to-use Python library for building parametric 3D CAD models. It has several goals. Build models with scripts that are as close as possible to how you’d describe the object to a human, using a standard, already established programming language. Create parametric models that can be very easily customized by end users. Output high-quality CAD formats like STEP and AMF in addition to traditional STL. Provide a non-proprietary, plain text model format that can be...
    Downloads: 42 This Week
    Last Update:
    See Project
  • 21
    Rhino

    Rhino

    On-device Speech-to-Intent engine powered by deep learning

    Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a given context of interest, in real-time. The end-to-end platform for embedding private voice AI into any software in a few lines of code. Design with no limits on top of a modular platform. Create use-case-specific voice AI models in seconds. Develop voice features with a few lines of code using intuitive and cross-platform SDKs. Deliver voice AI everywhere: on-device, mobile, web browsers,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    TextDistance

    TextDistance

    Compute distance between sequences

    Python library for comparing the distance between two or more sequences by many algorithms. For main algorithms, text distance try to call known external libraries (fastest first) if available (installed in your system) and possible (this implementation can compare this type of sequences). Install text distance with extras for this feature. Textdistance use benchmark results for algorithm optimization and try to call the fastest external lib first (if possible). TextDistance show benchmarks...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    ShortGPT

    ShortGPT

    AI framework for automated short video creation and editing tools

    ShortGPT is an experimental AI-powered framework designed to automate the creation of short-form and long-form video content. It provides a structured system that handles multiple stages of the content creation workflow, including script generation, asset sourcing, voiceover synthesis, and video editing. ShortGPT uses large language models to generate scripts and prompts that guide the automated editing and production process. ShortGPT includes specialized content engines that manage...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    Codespell

    Codespell

    Check code for common misspellings

    ...By focusing on commonly mistyped words and programming-specific terms, Codespell helps improve the readability and professionalism of open-source projects and enterprise software.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    kapture

    kapture

    Tools for manipulating datasets

    Kapture is a pivot file format, based on text and binary files, used to describe SfM (Structure From Motion) and more generally sensor-acquired data.
    Downloads: 2 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB