Implementation of Phenaki Video, which uses MaskGIT
Open source no-code system for text annotation and building of text
Free, high-quality text-to-speech API endpoint to replace OpenAI
A simple native web interface that uses ChatTTS to synthesize text
Towards Human-Level Text-to-Speech through Style Diffusion
Open source machine learning framework to automate text conversations
Underthesea - Vietnamese NLP Toolkit
Tools like web browser, computer access and code runner for LLMs
An open-source toolkit for monitoring Large Language Models (LLMs)
Multi-Voice and Prompt-Controlled TTS Engine
CLIP: predict the most relevant text snippet given an image
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Qwen-Image is a powerful image generation foundation model
Dataset of GPT-2 outputs for research in detection, biases, and more
Label Studio is a multi-type data labeling and annotation tool
Toolkit for conversational AI
Controllable and fast Text-to-Speech for over 7000 languages
Implementation of Video Diffusion Models
SOTA Open Source TTS
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
LLM
A nearly-live implementation of OpenAI's Whisper
Chat & pretrained large audio language model proposed by Alibaba Cloud
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Implementation of Imagen, Google's Text-to-Image Neural Network