Easy Docker setup for Stable Diffusion with user-friendly UI
Open-source AI VTuber platform with voice chat and Live2D avatars
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Multilingual Automatic Speech Recognition with word-level timestamps
Text and image to video generation: CogVideoX and CogVideo
Tensor Learning in Python
Generate audiobooks from e-books
SGLang is a fast serving framework for large language models
Official inference framework for 1-bit LLMs
OpenAI-style API for open large language models
Sparsity-aware deep learning inference runtime for CPUs
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Library for OCR-related tasks powered by Deep Learning
2D and 3D face alignment library built using PyTorch
Towards Human-Sounding Speech
A simple native web interface that uses ChatTTS to synthesize text
The largest collection of PyTorch image encoders / backbones
The Triton Inference Server provides an optimized cloud
Standardized Serverless ML Inference Platform on Kubernetes
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
HivisionIDPhotos: a lightweight and efficient AI ID photo tool
A lightweight text-to-speech model with zero-shot voice cloning
Fast State-of-the-Art Static Embeddings
Z80-μLM is a 2-bit quantized language model