Audio generation using diffusion models, in PyTorch
A Gradio web UI for running Large Language Models like LLaMA
An open source RDP server
Robust Speech Recognition via Large-Scale Weak Supervision
Transcribe any audio to text, translate and edit subtitles 100% locally
LilyPond sheet music text editor
Remote desktop and file transfer tool
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Speech recognition module for Python
A safe home for all your data
A deep learning toolkit for Text-to-Speech, battle-tested in research
A web application that allows users to interact with OpenAI's models
Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.
Label Studio is a multi-type data labeling and annotation tool
Speech-to-text, text-to-speech, and speaker recognition
JUCE is an open-source cross-platform C++ application framework
Implementation of AudioLM audio generation model in PyTorch
Implementation of NÜWA, attention network for text to video synthesis
Implementation of MusicLM music generation model in PyTorch
Git extension for versioning large files
API samples for the Universal Windows Platform
Integrate with the latest language models, image generation and speech
The most powerful screen recorder & annotation tool for Chrome
Transforming Multimodal Content into Captivating Multilingual Audio
A speech-text foundation model for real time dialogue