Multi-Voice and Prompt-Controlled TTS Engine
The behavior guidance framework for customer-facing LLM agents
Stanford NLP Python library for many human languages
Implementation of Phenaki Video, which uses MaskGIT
Open source no-code system for text annotation and building of text
Free, high-quality text-to-speech API endpoint to replace OpenAI
A simple native web interface that uses ChatTTS to synthesize text
Towards Human-Level Text-to-Speech through Style Diffusion
A Powerful Native Multimodal Model for Image Generation
Open source machine learning framework to automate text conversations
Underthesea - Vietnamese NLP Toolkit
An open-source toolkit for monitoring Large Language Models (LLMs)
Tools like web browser, computer access and code runner for LLMs
CLIP: predict the most relevant text snippet given an image
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Dataset of GPT-2 outputs for research in detection, biases, and more
Label Studio is a multi-type data labeling and annotation tool
Capable of understanding text, audio, vision, video
Toolkit for conversational AI
Controllable and fast Text-to-Speech for over 7000 languages
Implementation of Video Diffusion Models
A nearly-live implementation of OpenAI's Whisper
Qwen-Image is a powerful image generation foundation model
SOTA Open Source TTS
ComfyUI integration for Microsoft's VibeVoice text-to-speech model