Repo of Qwen2-Audio chat & pretrained large audio language model
Python Audio Analysis Library: Feature Extraction, Classification
Chat & pretrained large audio language model proposed by Alibaba Cloud
AudioMuse-AI is an Open Source Dockerized environment
Fast multimodal LLM for real-time voice interaction and AI apps
Get your documents ready for gen AI
Large Multimodal Models for Video Understanding and Editing
Private chat with local GPT with documents, images, video, etc.
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Toolkit for audio, music, and speech generation
SPPAS - the automatic annotation and analysis of speech
An extremely simple tool for separating vocals and background music
Task of transcribing piano recordings into MIDI files
General Speech Restoration
IPTV/NVR/CCTV/Video cloud https://fastocloud.com
Recommends music based upon your current taste