DeepSpeech is an open-source, embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper, and it is implemented with Google's TensorFlow. A pre-trained English model is available: to use it for speech-to-text, download it (along with the other material needed for inference) from the DeepSpeech releases page, following the instructions in the usage docs.
Features
- Uses a model trained by machine learning techniques
- Based on Baidu's Deep Speech research paper
- Uses Google's TensorFlow to make the implementation easier
- A pre-trained English model is available for use
- Download important inference material from the DeepSpeech releases page
- Runs in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers
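
As a rough illustration of inference with the pre-trained English model, here is a minimal Python sketch. It assumes the `deepspeech` Python package (version 0.7 or later) and NumPy are installed, and that the audio is a mono 16-bit PCM WAV file. The helper names `load_pcm16` and `transcribe` are made up for this example and are not part of DeepSpeech.

```python
import wave


def load_pcm16(path):
    """Read a mono 16-bit PCM WAV file and return (sample_rate, raw_bytes)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(model_path, wav_path, scorer_path=None):
    """Run speech-to-text on one WAV file and return the transcript."""
    # Imported here so load_pcm16 stays usable without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path is not None:
        # The external scorer (language model) is optional.
        model.enableExternalScorer(scorer_path)

    rate, raw = load_pcm16(wav_path)
    if rate != model.sampleRate():
        # No resampling in this sketch; the audio must match the model.
        raise ValueError(
            f"audio is {rate} Hz, model expects {model.sampleRate()} Hz"
        )

    # stt() takes the samples as a NumPy array of 16-bit integers.
    return model.stt(np.frombuffer(raw, dtype=np.int16))
```

With the model and scorer files downloaded from the releases page, a call might look like `print(transcribe("model.pbmm", "audio.wav", "model.scorer"))`. The three file names are placeholders for whatever the chosen release actually ships.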
License
Mozilla Public License 2.0 (MPL 2.0)