Repo of Qwen2-Audio chat & pretrained large audio language model
Fast multimodal LLM for real-time voice interaction and AI apps
Curated collection of Amazing Python scripts
A specialized Claude Code workspace for creating long-form
Virtual AI anchor that combines state-of-the-art technology
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
AI suite powered by state-of-the-art models and providing advanced AI
Toolkit for conversational AI
Autonomous agents for everyone
A natural language interface for computers
Chat & pretrained large audio language model proposed by Alibaba Cloud
A Claude skill that automatically posts personalized comments
Convert VoIP calls to text and analyze them with AI
HTML5 JavaScript audio recording in MP3, WAV, OGG, WebM, and AMR formats
Live analysis of pitches, harmonics, chords, and keys
Toolkit for audio, music, and speech generation
Application that detects musical notes from the microphone
2D open source actuator simulation software
3D open source actuator simulation software
Voice dialogue, role-playing, multi-topic discussion, picture creation
General Speech Restoration
Code for the Psygraph mobile application
Fully functional GCS (Ground Control System) for Zuppa Autopilot