Everything you need to build state-of-the-art foundation models
Single-cell analysis in Python
High-Resolution Image Synthesis with Latent Diffusion Models
Official inference framework for 1-bit LLMs
A lightweight vLLM implementation built from scratch
The official Python client for the Hugging Face Hub
Bring the notion of Model-as-a-Service to life
A library for accelerating Transformer models on NVIDIA GPUs
A 950-line, minimal, extensible LLM inference engine built from scratch
Jupyter notebook tutorials for OpenVINO
Faster Whisper transcription with CTranslate2
Wan2.1: Open and Advanced Large-Scale Video Generative Model
AirLLM 70B inference with a single 4GB GPU
Unified Model Serving Framework
Gaussian processes in TensorFlow
HunyuanVideo: A Systematic Framework For Large Video Generation Models
Training and deploying machine learning models on Amazon SageMaker
Operating LLMs in production
Official inference repo for FLUX.1 models
Powering Amazon custom machine learning chips
Code for running inference and finetuning with the SAM 3 model
A set of Docker images for training and serving models in TensorFlow
Sparsity-aware deep learning inference runtime for CPUs
Performance-optimized AI inference on your GPUs
GLM-4.5: Open-source LLM for intelligent agents by Z.ai