model-builder free download

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

OpenAI Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. It is a Transformer sequence-to-sequence model trained jointly on several speech processing tasks: multilingual speech recognition, speech translation, spoken language identification, and voice activity detection.

Downloads: 74 This Week

Last Update: 2025-06-26

Faster Whisper

Faster Whisper transcription with CTranslate2

Faster Whisper is an optimized reimplementation of the Whisper speech recognition model that delivers significantly faster inference while maintaining comparable accuracy. It runs on the CTranslate2 inference engine to reduce latency and resource consumption, which makes it well suited to real-time or large-scale transcription where performance is critical. It supports multiple model sizes, allowing users to balance speed and accuracy based on their needs. ...

Downloads: 61 This Week

Last Update: 2026-04-06

WhisperX

Automatic Speech Recognition with Word-level Timestamps

WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy.

Downloads: 59 This Week

Last Update: 2026-05-25

Insanely Fast Whisper

An opinionated CLI to transcribe Audio files w/ Whisper on-device

...The tool provides a streamlined CLI that makes it easy to run transcription tasks on local files or URLs without writing custom code. It supports multiple Whisper model variants, including distilled versions for faster inference with minimal accuracy loss.

Downloads: 2 This Week

Last Update: 2026-03-26

DeepSpeech

Open source embedded speech-to-text engine

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper, and is implemented with Google's TensorFlow. A pre-trained English model, along with other important inference material, can be downloaded from the DeepSpeech releases page by following the instructions in the usage docs.

Downloads: 3 This Week

Last Update: 2021-04-08

Search Results for "model-builder"

Showing 5 open source projects for "model-builder"
