PArallel Distributed Deep LEarning: Machine Learning Framework
An innovative library for efficient LLM inference
Optimizing inference proxy for LLMs
AICI: Prompts as (Wasm) Programs
A library for accelerating Transformer models on NVIDIA GPUs
Standardized Serverless ML Inference Platform on Kubernetes
A GPU-accelerated library containing highly optimized building blocks
Bayesian inference with probabilistic programming
Run serverless GPU workloads with fast cold starts on bare-metal
Simplifies the local serving of AI models from any source
GPU environment management and cluster orchestration
Unofficial Go (Golang) bindings for the Hugging Face Inference API
Lightweight Python library for adding real-time multi-object tracking
Easy-to-use deep learning framework with 3 key features
A unified framework for scalable computing
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Probabilistic reasoning and statistical analysis in TensorFlow
Phi-3.5 for Mac: Locally-run Vision and Language Models
Libraries for applying sparsification recipes to neural networks
Gaussian processes in TensorFlow
Single-cell analysis in Python
MII makes low-latency and high-throughput inference possible
Training and deploying machine learning models on Amazon SageMaker
A library to communicate with ChatGPT, Claude, Copilot, Gemini
Sparsity-aware deep learning inference runtime for CPUs