CLIP: predict the most relevant text snippet given an image
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
The most accurate natural language detection library for Python
Create videos with Stable Diffusion
Chat with it via text and voice
Label Studio is a multi-type data labeling and annotation tool
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Framework for building real-time voice and multimodal AI agents
Official Python inference and LoRA trainer package
Industrial-level controllable zero-shot text-to-speech system
Implementation of Phenaki Video, which uses MaskGIT
AI-powered tool for generating, optimizing, and translating subtitles
Unifying 3D Mesh Generation with Language Models
A nearly-live implementation of OpenAI's Whisper
Generating Immersive, Explorable, and Interactive 3D Worlds
Open source healthcare AI
Toolkit for conversational AI
SoTA open-source TTS
A fast TTS architecture with conditional flow matching
LLM
Knowledge Graph Generation from Any Text
A community-supported supercharged version of paperless
Scalable data preprocessing and curation toolkit for LLMs
OCR model for complex documents with layout-aware structured outputs
A very simple framework for state-of-the-art NLP