GLM-TTS is a text-to-speech synthesis system built on large language model technology. It focuses on high-quality, expressive, and controllable speech output, with features such as emotion modulation and zero-shot voice cloning. It uses a two-stage architecture: a generative LLM first converts text into intermediate speech token sequences, and a Flow-based neural model then converts those tokens into audio waveforms, which allows rich prosody and voice character even for unseen speakers. The system introduces a multi-reward reinforcement learning framework that jointly optimizes voice similarity, emotional expressiveness, pronunciation, and intelligibility, yielding output that can rival commercial options in naturalness and expressiveness. GLM-TTS also supports phoneme-level control and hybrid text + phoneme input, giving developers precise control over pronunciation, which is critical for multilingual text and polyphone-rich languages.
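As a rough illustration of what jointly optimizing several rewards can look like, the sketch below folds four per-utterance scores into one scalar with a weighted average. This is not GLM-TTS code: the `RewardScores` class, the `combined_reward` function, the equal weights, and the assumption that each score lies in [0, 1] are all hypothetical, and the project may combine its rewards in a different way.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RewardScores:
    """Hypothetical per-utterance scores, each assumed to lie in [0, 1]."""
    similarity: float
    emotion: float
    pronunciation: float
    intelligibility: float


# Illustrative equal weights; these are assumptions, not values from GLM-TTS.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "similarity": 0.25,
    "emotion": 0.25,
    "pronunciation": 0.25,
    "intelligibility": 0.25,
}


def combined_reward(scores: RewardScores,
                    weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Return the weighted average of the individual reward scores."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * getattr(scores, name) for name, w in weights.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    example = RewardScores(similarity=1.0, emotion=0.5,
                           pronunciation=0.5, intelligibility=1.0)
    print(combined_reward(example))
```

A single scalar of this kind is what a policy-gradient style update would typically consume; changing the weights shifts the trade-off between, for example, speaker similarity and emotional expressiveness.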

Features

  • Zero-shot voice cloning from short prompt audio
  • Multi-reward reinforcement learning for expressive prosody
  • Two-stage LLM + Flow-based audio generation pipeline
  • Support for phoneme-level control and hybrid inputs
  • High-quality synthesis comparable to commercial TTS systems
  • Streaming real-time speech synthesis
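
To make the idea of hybrid text + phoneme input more concrete, here is a minimal sketch of how mixed input could be split into plain-text and phoneme segments before synthesis. The curly-brace markup, the `parse_hybrid` function, and the segment labels are assumptions made for illustration only; the input syntax that GLM-TTS actually accepts is not described here and may differ.

```python
import re
from typing import List, Tuple

# Hypothetical markup: phonemes are wrapped in curly braces, e.g. "I {r eh d} it".
_PHONEME_SPAN = re.compile(r"\{([^{}]*)\}")


def parse_hybrid(text: str) -> List[Tuple[str, str]]:
    """Split hybrid input into ("text", ...) and ("phoneme", ...) segments."""
    segments: List[Tuple[str, str]] = []
    cursor = 0
    for match in _PHONEME_SPAN.finditer(text):
        # Keep any plain text that precedes this phoneme span.
        if match.start() > cursor:
            segments.append(("text", text[cursor:match.start()]))
        segments.append(("phoneme", match.group(1).strip()))
        cursor = match.end()
    # Keep any plain text that follows the last phoneme span.
    if cursor < len(text):
        segments.append(("text", text[cursor:]))
    return segments


if __name__ == "__main__":
    for kind, value in parse_hybrid("I {r eh d} the book"):
        print(kind, repr(value))
```

Segments tagged as phonemes could then bypass grapheme-to-phoneme conversion, which is where the pronunciation control for ambiguous words would come from in a design like this.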

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Text to Speech Software, Python AI Models

Registered

1 day ago