Repo of Qwen2-Audio chat & pretrained large audio language model
A specialized Claude Code workspace for creating long-form
Fast multimodal LLM for real-time voice interaction and AI apps
Curated collection of Amazing Python scripts
AI suite powered by state-of-the-art models and providing advanced AI
Toolkit for conversational AI
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
A natural language interface for computers
Autonomous agents for everyone
A Claude skill that automatically posts personalized comments
Where Models and Agents Co-Evolve
Convert VoIP calls to text and analyze them with AI
HTML5 JavaScript audio recording in MP3, WAV, OGG, WebM, and AMR formats
Virtual AI anchor that combines state-of-the-art technology
Live analysis of pitches, harmonics, chords, and keys
Chat & pretrained large audio language model proposed by Alibaba Cloud
Application that detects musical notes from the microphone
Toolkit for audio, music, and speech generation
2D open-source actuator simulation software
3D open-source actuator simulation software
Voice dialogue, role-playing, multi-topic discussion, picture creation
General Speech Restoration
A Python package to analyze and compare voices with deep learning
Code for the Psygraph mobile application