Offline inference engine for art, real-time voice conversations
A high-quality rapid TTS voice cloning model
Universal Radio Hacker: Investigate Wireless Protocols Like A Boss
Framework for building real-time multimodal voice AI agent apps
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Open source no-code system for text annotation and building of text
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Agent harness to make your slop code well-engineered and beautiful
A Family of Open Sourced Music Foundation Models
Faster Whisper transcription with CTranslate2
The official Python SDK for the ElevenLabs API
Industrial-level controllable zero-shot text-to-speech system
State-of-the-art TTS model under 25MB
Extensions for Python Markdown
Snippet solution for Vim
AI video generator optimized for low-VRAM and older GPUs
A lightweight text-to-speech model with zero-shot voice cloning
Translate the video from one language to another and embed dubbing
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Converts text to speech in real time
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Lightweight Markdown-only skills for autonomous ML research
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation