Running large language models on a single GPU
Chinese Llama-3 LLMs developed from Meta Llama 3
Text- and image-to-video generation: CogVideoX and CogVideo
Open source AI VTuber platform with voice chat and Live2D avatars
Tensor Learning in Python
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
A simple native web interface that uses ChatTTS to synthesize text into speech
SGLang is a fast serving framework for large language models
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Multilingual Automatic Speech Recognition with word-level timestamps
OpenAI-style API for open large language models
Sparsity-aware deep learning inference runtime for CPUs
Generate audiobooks from e-books
Geometric deep learning extension library for PyTorch
2D and 3D face alignment library built using PyTorch
The largest collection of PyTorch image encoders / backbones
Official inference framework for 1-bit LLMs
The Triton Inference Server provides an optimized cloud and edge inferencing solution
Z80-μLM is a 2-bit quantized language model
Simplifies the local serving of AI models from any source
Towards Human-Sounding Speech
Standardized Serverless ML Inference Platform on Kubernetes
Open deep learning compiler stack for CPU, GPU, etc.
A Python library for audio data augmentation
Fast State-of-the-Art Static Embeddings