Run local LLMs on any device; open-source
A high-throughput and memory-efficient inference and serving engine
Everything you need to build state-of-the-art foundation models
The official Python client for the Hugging Face Hub
Simplifies the local serving of AI models from any source
State-of-the-art diffusion models for image and audio generation
A set of Docker images for training and serving models in TensorFlow
Replace OpenAI GPT with another LLM in your app
Optimizing inference proxy for LLMs
Sparsity-aware deep learning inference runtime for CPUs
Data manipulation and transformation for audio signal processing
GPU environment management and cluster orchestration
Multilingual Automatic Speech Recognition with word-level timestamps
Uncover insights, surface problems, monitor, and fine-tune your LLM
Deep learning optimization library that makes distributed training easy
Standardized Serverless ML Inference Platform on Kubernetes
Lightweight Python library for adding real-time multi-object tracking to any detector
Tensor search for humans
Libraries for applying sparsification recipes to neural networks
Gaussian processes in TensorFlow
Single-cell analysis in Python
Neural Network Compression Framework for enhanced OpenVINO inference
OpenAI-style API for open large language models
Large Language Model Text Generation Inference
Training and deploying machine learning models on Amazon SageMaker