Chat & pretrained large audio language model proposed by Alibaba Cloud
Implementation of Imagen, Google's Text-to-Image Neural Network
GLM-4-Voice | End-to-End Chinese-English Conversational Model
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Lightning-fast, on-device TTS, running natively via ONNX
Speech-AI-Forge is a project developed around TTS generation models
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
A Model Context Protocol (MCP) server
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Chat & pretrained large vision language model
A deep learning toolkit for Text-to-Speech, battle-tested in research
Management of Yandex Station and other smart home devices
Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model
VITS2 backbone with multilingual-bert
A fast TTS architecture with conditional flow matching
21 Lessons, Get Started Building with Generative AI
A community-supported supercharged version of paperless
A very simple framework for state-of-the-art NLP
Open source personal AI Assistant for Linux, Windows and Mac
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Easy-to-use and powerful NLP library with an awesome model zoo
Lightweight package to simplify LLM API calls
StreamSpeech is a seamless model for offline speech recognition
Industrial-level controllable zero-shot text-to-speech system
Obsei is a low-code, AI-powered automation tool