Large Language Model Text Generation Inference
A library for accelerating Transformer models on NVIDIA GPUs
Multilingual Automatic Speech Recognition with word-level timestamps
Standardized Serverless ML Inference Platform on Kubernetes
Run 100B+ language models at home, BitTorrent-style
A unified framework for scalable computing
A Pythonic framework to simplify AI service building
Easiest and laziest way to build multi-agent LLM applications
Data manipulation and transformation for audio signal processing
State-of-the-art Parameter-Efficient Fine-Tuning
State-of-the-art diffusion models for image and audio generation
A lightweight vision library for performing large-scale object detection
Unified Model Serving Framework
LLMFlows - Simple, Explicit and Transparent LLM Apps
MII makes low-latency and high-throughput inference possible
The unofficial Python package that returns responses from Google Bard
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
20+ high-performance LLMs with recipes to pretrain and finetune at scale
Official inference library for Mistral models
A high-performance ML model serving framework that offers dynamic batching
Trainable models and neural network optimization tools
Replace OpenAI GPT with another LLM in your app
LLM training code for MosaicML foundation models
A set of Docker images for training and serving models in TensorFlow
Libraries for applying sparsification recipes to neural networks