Toolkit for conversational AI
Generating Immersive, Explorable, and Interactive 3D Worlds
Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model
Converts text to speech in real time
The behavior guidance framework for customer-facing LLM agents
A TTS that fits in your CPU (and pocket)
MTEB: Massive Text Embedding Benchmark
Stable Diffusion web UI
Python implementation of TextRank algorithms
An open-source toolkit for monitoring Large Language Models (LLMs)
A Systematic Framework for Interactive World Modeling
An open-source text-to-speech system built by inverting Whisper
Multimodal embedding and reranking models built on Qwen3-VL
A Unified Framework for Text-to-3D and Image-to-3D Generation
Easy-to-use and powerful NLP library with an awesome model zoo
The official Python SDK for the ElevenLabs API
Collection of Gemma 3 variants that are trained for performance
Generate audiobooks from e-books
LLM abstractions that aren't obstructions
State-of-the-art TTS model under 25MB
A community-supported supercharged version of paperless
High-Resolution Image Synthesis with Latent Diffusion Models
Build Vision Agents quickly with any model or video provider
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Python tool for converting files and office documents to Markdown