Standardized Serverless ML Inference Platform on Kubernetes
Run Local LLMs on Any Device. Open-source
Training and deploying machine learning models on Amazon SageMaker
State-of-the-art Parameter-Efficient Fine-Tuning
Low-latency REST API for serving text embeddings
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
A set of Docker images for training and serving models in TensorFlow
PyTorch extensions for fast R&D prototyping and Kaggle farming
MII makes low-latency and high-throughput inference possible
GPU environment management and cluster orchestration
Library for OCR-related tasks powered by Deep Learning
A library for accelerating Transformer models on NVIDIA GPUs
Large Language Model Text Generation Inference
Data manipulation and transformation for audio signal processing
A Pythonic framework to simplify AI service building
OpenAI-style API for open large language models
Trainable models and NN optimization tools
Simplifies the local serving of AI models from any source
A high-performance ML model serving framework that offers dynamic batching
Unified Model Serving Framework
Probabilistic reasoning and statistical analysis in TensorFlow
PyTorch domain library for recommendation systems
Tensor search for humans
A unified framework for scalable computing
Deep learning optimization library: makes distributed training easy