speech decoder free download

Showing 4 open source projects for "speech decoder"

View related business solutions

AI Models Python Clear Filters & Widen Search

Managed MySQL, PostgreSQL, and SQL Databases on Google Cloud
Get back to your application and leave the database to us. Cloud SQL automatically handles backups, replication, and scaling.

Cloud SQL is a fully managed relational database for MySQL, PostgreSQL, and SQL Server. We handle patching, backups, replication, encryption, and failover—so you can focus on your app. Migrate from on-prem or other clouds with free Database Migration Service. IDC found customers achieved 246% ROI. New customers get $300 in credits plus a 30-day free trial.

Try Cloud SQL Free
Go From Idea to Deployed AI App Fast
One platform to build, fine-tune, and deploy. No MLOps team required.

Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.

Try Free
1

IndexTTS2

Industrial-level controllable zero-shot text-to-speech system

IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. ...

Downloads: 10 This Week

Last Update: 2025-11-27
See Project
2

FireRedASR

Open-source industrial-grade ASR models

FireRedASR is an industrial-grade family of open-source automatic speech recognition models designed to provide high-precision speech-to-text performance across languages including Mandarin, English, and various Chinese dialects, achieving new state-of-the-art benchmarks on public test sets. The project includes multiple model variants to meet different application needs, such as high-accuracy end-to-end interaction using an encoder-adapter-LLM framework and efficient real-time recognition using attention-based encoder-decoder architectures, giving developers flexibility in balancing performance and resource constraints. ...

Downloads: 1 This Week

Last Update: 2026-02-16
See Project
3

CSM (Conversational Speech Model)

A Conversational Speech Generation Model

The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing.

Downloads: 4 This Week

Last Update: 2025-03-19
See Project
4

Denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)

Denoiser is a real-time speech enhancement model operating directly on raw waveforms, designed to clean noisy audio while running efficiently on CPU. It uses a causal encoder-decoder architecture with skip connections, optimized with losses defined both in the time domain and frequency domain to better suppress noise while preserving speech. Unlike models that operate on spectrograms alone, this design enables lower latency and coherent waveform output.

Downloads: 4 This Week

Last Update: 2025-10-07
See Project
Train ML Models With SQL You Already Know
BigQuery turns your data warehouse into an AI platform. No new languages required.

Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.

Try Free