Get your documents ready for gen AI
Framework for building real-time multimodal voice AI agent apps
Large Multimodal Models for Video Understanding and Editing
Private chat with a local GPT over documents, images, video, etc.
Large language model for a sales anchor (live-stream selling host)
Controllable and fast Text-to-Speech for over 7000 languages
Build cross-modal and multimodal applications on the cloud
A feature-packed DJ console and internet radio client for Linux users
A Conversational Speech Generation Model
Software that uses AI to perform real-time voice conversion
B language compiler written in Python targeting RISVM
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
An extremely simple tool for separating vocals and background music
Code for the paper Hybrid Spectrogram and Waveform Source Separation
MahaKurawa.My.ID MP4 VA Extract is a tool to extract MP4 file content
User-friendly library to find similar objects
Automatically generate and overlay subtitles for any video
Real-time music generation using Stable Diffusion techniques
Using OpenAI's Whisper to automatically generate YouTube subtitles
Task of transcribing piano recordings into MIDI files
A CLI script to generate subtitle files (SRT/VTT/TXT) for any video
Separate audio recordings into individual sources
We provide a PyTorch implementation of the paper Voice Separation
Voice chats, private incoming and outgoing calls in Telegram