Large Audio Language Model built for natural interactions
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Streaming Real-time Audio-Driven Avatar Generation
BlackHole is a modern macOS audio loopback driver
Synchronized Translation for Videos
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Fast multimodal LLM for real-time voice interaction and AI apps
Automated YouTube Shorts pipeline
Video translation and dubbing tool powered by LLMs
Framework for building real-time voice and multimodal AI agents
Virtual modular synthesizer plugin
A gallery that showcases on-device ML/GenAI use cases
Translate the video from one language to another and embed dubbing
Generate audiobooks from EPUBs, PDFs and text with captions
Robust Speech Recognition via Large-Scale Weak Supervision
Generate audiobooks from e-books, voice cloning & 1107+ languages
Cross-platform, customizable ML solutions
Stream VR games from your PC to your headset via Wi-Fi
Taming Stable Diffusion for Lip Sync
High-resolution models for human tasks
A Python tool that uses GPT-4, FFmpeg, and OpenCV
A Systematic Framework for Interactive World Modeling
Clean network diagrams. One-time setup, zero upkeep
Solidity Compiler for Solana, Polkadot and Stellar