Showing 20 open source projects for "python text"

  • 1
    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 143 This Week
  • 2
    RealtimeSTT

    A robust, efficient, low-latency speech-to-text library

    RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.
    Downloads: 5 This Week
  • 3
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 58 This Week
  • 4
    WhisperX

    Automatic Speech Recognition with Word-level Timestamps

    WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy. It is particularly effective for...
    Downloads: 35 This Week
  • 5
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 5 This Week
  • 6
    Faster Whisper

    Faster Whisper transcription with CTranslate2

    Faster Whisper is an optimized implementation of the Whisper speech recognition model designed to deliver significantly faster inference while maintaining comparable accuracy. It leverages efficient inference engines and optimized computation strategies to reduce latency and resource consumption. The system is particularly useful for real-time or large-scale transcription tasks where performance is critical. It supports multiple model sizes, allowing users to balance speed and accuracy based...
    Downloads: 23 This Week
  • 7
    Insanely Fast Whisper

    An opinionated CLI to transcribe Audio files w/ Whisper on-device

    Insanely Fast Whisper is a high-performance command-line tool designed to dramatically accelerate speech-to-text transcription using OpenAI’s Whisper models on local hardware. It leverages modern optimizations such as batch processing, mixed precision, and advanced attention mechanisms like Flash Attention to significantly reduce inference time while maintaining high transcription accuracy. The project is built on top of the Transformers ecosystem and integrates with libraries such as...
    Downloads: 3 This Week
  • 8
    Whisper Batch Transcriber

    Unlimited, private and free Speech-To-Text program

    About: Automatically transcribe all of your voice recordings into clean, organized text files. It's free, fully automated, and unlimited, using state-of-the-art speech-to-text technology. Works 100% offline on your computer, privately and locally. Use cases: Convert speeches, podcasts, webinars, monologues, storytelling, and other recorded speech into a formatted .txt file, one sentence per line. Notes: It's 2GB in size and requires 2-6GB of GPU VRAM. ...
    Downloads: 8 This Week
  • 9

    SoundTranscriber

    SoundTranscriber can be used to generate automatic transcription / aut
    Downloads: 1 This Week
  • 10
    Mice MX OS speech to text Voice Control

    Mice speech to text with MX Cinnamon OS ISO

    Note about this image: it contains a system based on MX Linux, created to improve accessibility within the Linux environment. The distribution uses the Cinnamon desktop interface, configured to be operated using voice commands and voice output. The user interface and the control of your own devices and home automation systems can be customized and extended. The voice control program MiceStTM.py was developed to enable easy adaptation to other languages. However, only...
    Downloads: 2 This Week
  • 11

    Mice TTM

    mice stt tts

    This tool is being developed specifically for accessibility on Linux. It converts text produced by speech recognition into dictation and can execute macros. It works without an Internet connection, because speech recognition runs on the PC itself. The mouse can be moved to named words, which can then be selected or clicked by voice command. In addition, text passages, for example in LibreOffice Writer, can be handled by voice command accordingly...
    Downloads: 1 This Week
  • 12
    VATSG

    Video automatic transcribe and translated subtitle generator

    VATSG generates SRT subtitles from a video file in any source language that Whisper supports, then produces a translated subtitle file in any target language that DeepL supports. It uses [moviepy](https://github.com/Zulko/moviepy) to extract the audio as MP3, [faster-whisper](https://github.com/guillaumekln/faster-whisper) to transcribe it, and the DeepL API to generate the target-language subtitle file (SRT format). If you...
    Downloads: 9 This Week
  • 13
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the...
    Downloads: 4 This Week
  • 14
    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library that automates solving ReCAPTCHA v2 by image or audio, using Mozilla's DeepSpeech, PocketSphinx, Microsoft Azure's, Google Speech, and Amazon's Transcribe speech-to-text APIs. It also uses image recognition to detect the object suggested in the captcha. Built with Pyppeteer (a Chrome automation framework similar to Puppeteer), PyDub for converting MP3 files into WAV, aiohttp for a minimalistic async web server, and Python's built-in asyncio.
    Downloads: 2 This Week
  • 15
    MDictate

    Speech to text using python, pocketsphinx, ready to deploy

    Automated speech recognition software is extremely cumbersome. This project aims to incrementally improve the quality of an open-source, ready-to-deploy speech-to-text recognition system. It runs on Windows via mdictate.exe, but the core logic lives in the mdictate.py script, which should work on Windows/Linux/OS X. Version 1.0 uses pocketsphinx's default setup with a basic graphical interface.
    Downloads: 0 This Week
  • 16
    JAVT - Just Another Voice Transformer

    Just Another Speech Recognition and Text to Speech software.

    JAVT, or Just Another Voice Transformer (formerly Just Another Video Transcriber), is speech recognition software that also supports text-to-speech and simple media conversion. JAVT converts video files to WAV audio using ffmpeg, then transcribes the audio to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and have JAVT read it aloud through text-to-speech conversion.
    Downloads: 3 This Week
  • 17

    Speech Sentiment Analysis

    Voice to Text Sentiment Analysis

    Voice-to-text sentiment analysis converts an audio signal to text and calculates the sentiment polarity of the sentence. The code currently works on one sentence at a time. Sentiment scoring is done on the spot using a speaker. The speech-to-text system currently used is the MS Windows speech-to-text converter. However, significant modifications can be made for audio recognition by a refined signal processing system.
    Downloads: 0 This Week
  • 18

    Voice_Tic_Tac_Toe

    Play Tic Tac Toe With Voice Input

    Voice Tic Tac Toe lets you play Tic Tac Toe via voice input. The game engine is written in Python and uses Microsoft SAPI for speech-to-text conversion.
    Downloads: 0 This Week
  • 19
    Voice Conference Manager (VCM) uses VoiceXML and CCXML to control speech recognition, text-to-speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. Can be hosted on public servers.
    Downloads: 0 This Week
  • 20
    Software to fit whole-sentence language models using the principle of maximum entropy. For developers of speech recognizers, text prediction interfaces, OCR, machine translation software.
    Downloads: 0 This Week