Speech to text using Python and pocketsphinx, ready to deploy
Automated speech recognition software is extremely cumbersome. This project aims to incrementally improve the quality of an open-source, ready-to-deploy speech-to-text recognition system.
Runs on Windows via mdictate.exe, but the core logic is in the mdictate.py script, which should work on Windows, Linux, and OS X.
Version 1.0 uses pocketsphinx's default setup with a basic graphical interface.
Just Another Speech Recognition and Text to Speech software.
JAVT, or Just Another Voice Transformer (formerly Just Another Video Transcriber), is speech recognition software that also supports text-to-speech and simple media conversion. JAVT converts video files to WAV audio using ffmpeg, then transcribes the audio to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and have JAVT read it aloud through text-to-speech conversion.
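The video-to-WAV step described above can be sketched roughly as follows. This is not JAVT's own code: the function name build_ffmpeg_wav_command and the 16 kHz mono 16-bit PCM target are assumptions, chosen because that format is commonly used with CMU Sphinx acoustic models. Only standard ffmpeg options (-i, -vn, -acodec, -ar, -ac, -y) are used.

```python
from pathlib import Path


def build_ffmpeg_wav_command(video_path, wav_path=None, sample_rate=16000):
    """Build an ffmpeg argument list that extracts speech-friendly WAV audio.

    Hypothetical helper for illustration; JAVT's real invocation is not
    shown in the listing. The sample rate default is an assumption.
    """
    video = Path(video_path)
    # Default output: same name as the video, with a .wav extension.
    wav = Path(wav_path) if wav_path else video.with_suffix(".wav")
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-i", str(video),        # input video file
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        "-ar", str(sample_rate), # audio sample rate in Hz
        "-ac", "1",              # downmix to mono
        str(wav),
    ]
```

With ffmpeg installed and on the PATH, the returned list could be passed to subprocess.run(cmd, check=True); the resulting WAV file would then be handed to whichever recognizer is in use.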
Software to fit whole-sentence language models using the principle of maximum entropy. For developers of speech recognizers, text prediction interfaces, OCR, machine translation software.
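As a rough illustration of the idea, a whole-sentence maximum entropy model is usually written as p(s) proportional to p0(s) * exp(sum_i lambda_i * f_i(s)), where p0 is a baseline model and each f_i is a feature of the entire sentence. The toy sketch below assumes a small finite set of candidate sentences so the normalizer can be computed exactly; real models range over all sentences and typically rely on sampling. All names here are hypothetical and do not reflect the listed software's API.

```python
import math


def whole_sentence_maxent(sentences, baseline, features, weights):
    """Return normalized probabilities over a finite set of sentences.

    baseline: dict mapping each sentence to its baseline probability p0(s)
    features: list of functions f_i(sentence) -> float
    weights:  list of lambda_i values, one per feature
    """
    scores = {}
    for s in sentences:
        exponent = sum(w * f(s) for w, f in zip(weights, features))
        scores[s] = baseline[s] * math.exp(exponent)
    z = sum(scores.values())  # exact normalizer over the toy candidate set
    return {s: score / z for s, score in scores.items()}


# Toy example: one feature that fires when a sentence starts with "the".
candidates = ["the cat sat", "cat the sat"]
p0 = {s: 0.5 for s in candidates}
starts_with_the = lambda s: 1.0 if s.startswith("the ") else 0.0
probs = whole_sentence_maxent(candidates, p0, [starts_with_the], [1.0])
```

With a positive weight on the feature, the sentence that satisfies it receives more probability mass than the baseline alone would give it.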