PyTorch extensions for fast R&D prototyping and Kaggle farming
The Triton Inference Server provides an optimized cloud and edge inferencing solution (see the client sketch after this list)
Standardized Serverless ML Inference Platform on Kubernetes
Powering Amazon's custom machine learning chips
Optimizing inference proxy for LLMs
Replace OpenAI GPT with another LLM in your app
Bring the notion of Model-as-a-Service to life
Library for serving Transformers models on Amazon SageMaker
A set of Docker images for training and serving models in TensorFlow
LLMFlows - Simple, Explicit and Transparent LLM Apps
Data manipulation and transformation for audio signal processing
State-of-the-art Parameter-Efficient Fine-Tuning (a LoRA sketch follows this list)
Probabilistic reasoning and statistical analysis in TensorFlow (a distributions sketch follows this list)
Database system for building simpler and faster AI-powered applications
Lightweight Python library for adding real-time multi-object tracking to any detector
Gaussian processes in TensorFlow (a regression sketch follows this list)
Low-latency REST API for serving text embeddings
Libraries for applying sparsification recipes to neural networks
A library to communicate with ChatGPT, Claude, Copilot, and Gemini
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
Visual Instruction Tuning: Large Language-and-Vision Assistant
Adversarial Robustness Toolbox (ART) - Python library for ML security (an evasion-attack sketch follows this list)
OpenAI-style API for open large language models (a client sketch follows this list)
A Unified Library for Parameter-Efficient Learning
Images to inference with no labeling
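
The Triton entry above describes a serving system rather than a Python API; the sketch below shows one way to query a running server over HTTP using the `tritonclient` package. The model name "my_model" and the tensor names "INPUT0"/"OUTPUT0" are hypothetical and depend on your model repository.

```python
# Minimal sketch of an HTTP inference request to a running Triton server.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

data = np.random.rand(1, 4).astype(np.float32)
inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")  # hypothetical tensor name
inp.set_data_from_numpy(data)

result = client.infer(model_name="my_model", inputs=[inp])  # hypothetical model name
print(result.as_numpy("OUTPUT0"))  # hypothetical output tensor name
```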
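
For the parameter-efficient fine-tuning entry, a minimal LoRA sketch using Hugging Face's `peft` and `transformers` packages; the base model ("gpt2") and target module ("c_attn") are illustrative choices, not prescriptions.

```python
# Wrap a base model with LoRA adapters so only a small fraction of
# parameters receives gradients during fine-tuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative base model
config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], lora_dropout=0.05)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # reports the small trainable fraction
```

The wrapped model then trains like any other Transformers model; only the adapter weights are updated.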
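
For the TensorFlow Probability entry, a minimal sketch of its distributions API: define a distribution, evaluate log-densities, and draw samples.

```python
import tensorflow_probability as tfp

tfd = tfp.distributions
prior = tfd.Normal(loc=0.0, scale=1.0)  # standard normal

print(prior.log_prob(0.5))  # log-density at a point
print(prior.sample(3))      # three random draws
```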
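
For the Gaussian-process entry, a minimal regression sketch assuming GPflow 2.x on toy 1-D data.

```python
import numpy as np
import gpflow

# Toy 1-D regression data.
X = np.random.rand(20, 1)
Y = np.sin(6 * X) + 0.1 * np.random.randn(20, 1)

# Exact GP regression with a squared-exponential kernel.
model = gpflow.models.GPR(data=(X, Y), kernel=gpflow.kernels.SquaredExponential())
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)

# Posterior mean and variance at new inputs.
mean, var = model.predict_y(np.linspace(0, 1, 5).reshape(-1, 1))
```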
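
For the Adversarial Robustness Toolbox entry, a minimal evasion-attack sketch: wrap a scikit-learn classifier and generate adversarial examples with the fast gradient method. The data and model here are toy stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

# Toy binary-classification data.
X = np.random.rand(100, 4).astype(np.float32)
y = (X[:, 0] > 0.5).astype(int)

clf = LogisticRegression().fit(X, y)
wrapped = SklearnClassifier(model=clf)  # ART wrapper exposing gradients

attack = FastGradientMethod(estimator=wrapped, eps=0.1)
X_adv = attack.generate(x=X)  # perturbed copies of the inputs
```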
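
Several entries above (the OpenAI-style API and the GPT-replacement libraries) share one pattern: expose an open model behind the OpenAI wire format so existing clients work unchanged. Below is a sketch with the official `openai` Python client; the base URL and model id are assumptions that depend on the server you run.

```python
from openai import OpenAI

# Point the standard client at a local OpenAI-compatible server
# (URL and model id are hypothetical).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="my-local-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```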