Perso AI
Perso AI Dubbing is an AI-powered video dubbing and translation platform that localizes content into 33+ languages in minutes, with speech recognition in 99+ languages. Teams upload a video, select target languages, and receive a studio-quality dubbed version — complete with lip-sync and voice cloning that preserves the original speaker's tone, accent, and emotion.
Key capabilities:
• AI Voice Cloning — Matches the original speaker's voice and emotional tone
• AI Lip Sync — Aligns translated audio with on-screen mouth movements
• Auto Subtitle Generation — Creates and exports subtitles automatically
• Script Editor — Lets you review and refine translations per speaker
• Multi-Speaker Support — Detects and dubs up to 10 speakers per video
Trusted by 450,000+ users across 80+ countries. Pricing starts at $6.99/month. Developed by ESTsoft (est. 1993, KOSDAQ: 047560); ISO/IEC 27001 certified.
Play.ht
AI-powered text-to-voice generation.
Play.ht offers high-fidelity AI voices for any project that needs human-sounding voiceovers and performances.
Hollywood studios, auto manufacturers, and other large enterprises use Play.ht to create realistic and engaging voiceovers quickly, without the hassle of scheduling and hiring voice talent. Our voices sound natural, expressive, and engaging, just like human voice talent.
Play.ht offers API access as well as an online rich-text editor that allows you to generate entire performances with multiple speakers, edit their pacing, and generate unique versions of each paragraph, all within seconds.
Join other companies looking to scale up and simplify their voice work by scheduling a live demo today.
Voxtral TTS
Voxtral TTS is a state-of-the-art, multilingual text-to-speech model designed to generate highly realistic and emotionally expressive speech from text, combining strong contextual understanding with advanced speaker modeling to produce natural, human-like audio output. Built as a lightweight model with around 4 billion parameters, it delivers efficient performance while maintaining high quality, enabling scalable deployment for enterprise voice applications. It supports nine major languages and diverse dialects, and can adapt to new voices using only a short reference audio sample, capturing not just tone but also rhythm, pauses, intonation, and emotional nuance. Its zero-shot voice cloning capabilities allow it to replicate a speaker’s style without additional training, and it can even perform cross-lingual voice adaptation, generating speech in one language while preserving the accent of another.
Voicv
Voicv is a voice cloning platform that turns your voice into a digital asset in minutes using zero-shot learning. It can clone any voice from a 10 to 30 second audio sample while maintaining high fidelity and natural expression. Supported languages include English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish. Voicv offers real-time processing, enabling fast voice generation suitable for quick iterations and production needs. It achieves professional-quality output with extremely low error rates, ensuring clear and accurate speech generation. Users can access Voicv through a web interface or desktop applications. For enterprise users, Voicv provides a production-ready API and comprehensive documentation for integration.