Repo of Qwen2-Audio chat & pretrained large audio language model
Fast multimodal LLM for real-time voice interaction and AI apps
A specialized Claude Code workspace for creating long-form
Curated collection of Amazing Python scripts
Virtual AI anchor that combines state-of-the-art technology
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
AI suite powered by state-of-the-art models and providing advanced AI
Toolkit for conversational AI
Autonomous agents for everyone
A natural language interface for computers
Chat & pretrained large audio language model proposed by Alibaba Cloud
A Claude skill that automatically posts personalized comments
Convert VoIP calls to text and analyze them with AI
HTML5 JavaScript audio recording in mp3, wav, ogg, webm, and amr formats
Live analysis of pitches, harmonics, chords, and keys.
Toolkit for audio, music, and speech generation
2D open source actuator simulation software
Application that detects musical notes from the microphone
3D open source actuator simulation software
Voice dialogue, role-playing, multi-topic discussion, picture creation
General Speech Restoration
Code for the Psygraph mobile application
Fully functional GCS (Ground Control System) for Zuppa Autopilot