Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.
Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
Try It Free
$300 Free Credits for Your Google Cloud Projects
Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.
Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
TTS engne for Lithuanian language synthesation based on LIEPA project (https://www.xn--ratija-ckb.lt/liepa/infrastrukturines-paslaugos/elektroninio-teksto-skaitytuvas/7563)
mp3 library, advanced ID3V1 and ID3V2 tagger, player. Organize a large mp3 library, over 40,000 songs. Speech synthesis and tag backup utilities. Scripts to maintain and organize song files.
Speech recognition application builder and library
Java library and tools to create open source speech recognition applications. Generates dialogs for conversational interfaces. Works with a popular open source speech recognition library.
Deploy in 115+ regions with the modern database for every enterprise.
MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
A speech synthesis and recognition library that is cross-platform, accessible from Java and C++, and has a very small API. Uses CMU Sphinx4 and FreeTTS internally.
A JNI wrapper for pjsip. You can use this wrapper to develop Java applications using the pjsip library. At the moment only the pjsua API is implemented. If you would like to obtain a commercial license, or need customisations, please contact us.
Arabisc is speaker independent large vocabulary continuous speech recognizer for Arabic language released under GNU license.It is also a collection of open source tools that allows researchers and developers to build speech recognition systems for Arab
Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure
Native application identity and user-based security for your Azure cloud
Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
Scalable Language API (SLAPI) The most comprehensive architecture for conversational natural-language applications including speech recognition/synthesis, semantics, & machine translation. Used on Android & other mobile app platforms.
Lisn uses the JNI to interface with various media players(winamp/itunes) using either the COM interface or HWND calls. Allows intuitive interaction with the media players as well as a quasi heuristic algorithm for volume control(under development)
This is a Java Wrapper for Cepstral.com text-to-speech engine. Cepstral makes very affordable realistic synthetic voices and provides the developers with C++ API's. We have developed a JSAPI compliant Java-to-JNI-to-C++ Wrapper to use with Cepstral TTS.
Speech based User Interface Components Library for Java is a project to create Java controls and applications that can be used not only by literate people but also by non-literates. Speech and visual element with minimal text is used to create components
JeSpeak is a Java library that bridges eSpeak, which is a compact open source software speech synthesizer. JeSpeak uses JNI to make native call to libespeak.
AGTK is a suite of software components for building tools for annotating linguistic signals,
time-series data which documents any kind of linguistic behavior (e.g. audio, video).
The internal data structures are based on annotation graphs.
VoxForge collects user-submitted speech audio files for the creation of Acoustic Models for Free and Open Source Speech Recognition Engines such as HTK, Julius, ISIP and Sphinx.
The Lazybones project is a server and a collection of clients. It allows one person to voice control multiple machines. Designed for software developers and sys admins, Lazybones speeds up redundant and/or complex tasks.
DTW is intended to be a Voice in -> Pictures + Text out program written in java using Sphinx from CMU. This is intended to be useful to people who have good oral/visual literacy skills but poor written literacy skills.
Time of day service over telephone using Voicent Gateway, a VoiceXML gateway that specially designed for voice modems. A Free version is available for download at http://www.voicent.com/download. Sample code for interactive telephony applications.
OC Volume is a speech recognition engine written in Java for integration with other applications. It is currently an User-Dependent Isolated Word Recognizer and can be expanded to include more capability for recognition.