Repo of Qwen2-Audio chat & pretrained large audio language model
Python Audio Analysis Library: Feature Extraction, Classification
Chat & pretrained large audio language model proposed by Alibaba Cloud
AudioMuse-AI is an Open Source Dockerized environment
Audio Plugin for Audio to MIDI transcription using deep learning
A library for audio and music analysis, feature extraction
Fast multimodal LLM for real-time voice interaction and AI apps
Cross-platform, customizable ML solutions
A suite of advanced multi-modal LLMs
Get your documents ready for gen AI
Large Multimodal Models for Video Understanding and Editing
Private chat with a local GPT over documents, images, video, etc.
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Toolkit for audio, music, and speech generation
Local AI file organization with categorization and rename suggestions
SPPAS - the automatic annotation and analyses of speech
Visual AI Workflow Builder
An extremely simple tool for separating vocals and background music
A library for audio and music analysis, feature extraction
Common Resource Grep
Task of transcribing piano recordings into MIDI files
General Speech Restoration