Stanford NLP Python library for many human languages
Free, high-quality text-to-speech API endpoint to replace OpenAI
Open source no-code system for text annotation and building of text
Towards Human-Level Text-to-Speech through Style Diffusion
A Powerful Native Multimodal Model for Image Generation
Label Studio is a multi-type data labeling and annotation tool
Tools like web browser, computer access and code runner for LLMs
Capable of understanding text, audio, vision, video
Qwen-Image is a powerful image generation foundation model
Underthesea - Vietnamese NLP Toolkit
An open-source toolkit for monitoring Large Language Models (LLMs)
A nearly-live implementation of OpenAI's Whisper
Open source machine learning framework to automate text conversations
Dataset of GPT-2 outputs for research in detection, biases, and more
MTEB: Massive Text Embedding Benchmark
SOTA Open Source TTS
Chat & pretrained large audio language model proposed by Alibaba Cloud
Controllable and fast Text-to-Speech for over 7000 languages
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Generate audiobooks from e-books, with voice cloning & 1107+ languages
Industrial-level controllable zero-shot text-to-speech system
Implementation of Phenaki Video, which uses MaskGIT
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Lightning-fast, on-device TTS, running natively via ONNX
Speech-AI-Forge is a project developed around TTS generation models