Automatically translates the text of a video based on a subtitle file
Label Studio is a multi-type data labeling and annotation tool
AI tool converting video/audio into structured documents instantly
Code and models for the ICML 2024 paper NExT-GPT
Trying to be a robust, user-friendly and hackable music player
Stealth Chromium that passes every bot detection test
A2M is a desktop app that converts audio to MIDI in one click
One-click deployment (including offline integration package)
A TTS model capable of generating ultra-realistic dialogue
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Towards Human-Sounding Speech
A Systematic Framework for Interactive World Modeling
State-of-the-art diffusion models for image and audio generation
Voice Recognition to Text Tool
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Open Source Speech Language Model
Translate the video from one language to another and embed dubbing
A sound cloning tool with a web interface, using your voice
The official Python library for the OpenAI API
Spring AI Alibaba examples for building and testing AI apps
VMZ: Model Zoo for Video Modeling
ImageBind: One Embedding Space to Bind Them All
GenAI Processors is a lightweight Python library
Mopidy is an extensible music server written in Python
An Open Source text-to-speech system built by inverting Whisper