Speech Note Linux app. Note taking, reading and translating
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Document Image Parsing via Heterogeneous Anchor Prompting
Large Multimodal Models for Video Understanding and Editing
Private chat with local GPT with documents, images, video, etc.
Build Vision Agents quickly with any model or video provider
Controllable and fast Text-to-Speech for over 7000 languages
Build cross-modal and multimodal applications on the cloud
Transform your voice in real time with the Voxal voice changer
A Conversational Speech Generation Model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Chinese text-to-speech engine
User-friendly library to find similar objects
Applications of Deep Neural Networks
Common Resource Grep
Windows-GUI
Task of transcribing piano recordings into MIDI files
Text-to-Speech for Basque and Spanish
Separate audio recordings into individual sources
List of useful data augmentation resources
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
IPTV/NVR/CCTV/Video cloud https://fastocloud.com
Text-to-Speech TTS for Basque, Spanish, Catalan, Galician and English
Resources for speech processing in Brazilian Portuguese