StreamSpeech is an “all-in-one” speech model that performs offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Introduced in an ACL 2024 paper, it targets streaming, low-latency scenarios in which intermediate results and final translations or synthesized speech must be produced continuously while audio is still arriving. The model covers eight tasks: automatic speech recognition (ASR), speech-to-text translation (S2TT), speech-to-speech translation (S2ST), and text-to-speech synthesis (TTS), each in an offline mode and a streaming (simultaneous) mode, all handled by the same underlying system. During simultaneous translation, StreamSpeech can optionally output intermediate ASR transcripts and text translations, giving users or downstream applications real-time visibility into what the system is hearing and how it is translating.

Features

  • Unified model for ASR, speech translation, and TTS in both offline and streaming modes
  • Supports eight distinct tasks including simultaneous S2ST, S2TT, and real-time TTS
  • Outputs intermediate transcripts and translations for richer low-latency interaction
  • SimulEval integration and agent scripts for systematic streaming evaluation
  • Web GUI demo and project page with audio samples and visualizations
  • Reported state-of-the-art performance on offline and simultaneous speech-to-speech translation (per the ACL 2024 paper)
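
To make the streaming behavior described above more concrete, here is a minimal Python sketch of how an application might feed audio in fixed-size chunks and receive an intermediate transcript and translation after each one. It does not use the StreamSpeech API: `PartialResult`, `chunk_audio`, and `fake_streaming_model` are hypothetical names invented for illustration, the 320 ms chunk size is an arbitrary assumption, and the model is replaced by a stub that emits placeholder strings.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class PartialResult:
    """Hypothetical container for what a streaming model might emit per chunk."""
    seconds_heard: float  # how much audio has been consumed so far
    transcript: str       # intermediate ASR text (placeholder in this sketch)
    translation: str      # intermediate translated text (placeholder in this sketch)


def chunk_audio(samples: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield consecutive fixed-size chunks; the last one may be shorter."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def fake_streaming_model(
    chunks: Iterator[List[float]], sample_rate: int
) -> Iterator[PartialResult]:
    """Stand-in for a real model (assumption): one placeholder result per chunk."""
    heard = 0
    for index, chunk in enumerate(chunks, start=1):
        heard += len(chunk)
        yield PartialResult(
            seconds_heard=heard / sample_rate,
            transcript=f"<asr after chunk {index}>",
            translation=f"<translation after chunk {index}>",
        )


if __name__ == "__main__":
    sample_rate = 16000
    audio = [0.0] * sample_rate  # one second of silence as dummy input
    chunk_size = 5120            # 320 ms at 16 kHz (assumed value)
    for result in fake_streaming_model(chunk_audio(audio, chunk_size), sample_rate):
        print(f"{result.seconds_heard:.2f}s  {result.transcript}  {result.translation}")
```

The point of the generator-based design is that the consumer sees each partial result as soon as its chunk has been processed, rather than waiting for the full utterance. A real system would swap the stub for actual model inference and would decide for itself when enough audio has arrived to emit new output.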

Categories

Text to Speech

License

MIT License

Additional Project Details

Programming Language

Python

Registered

3 hours ago