Uplift modeling and causal inference with machine learning algorithms
Efficient few-shot learning with Sentence Transformers
OpenMMLab Model Deployment Framework
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
A library to communicate with ChatGPT, Claude, Copilot, Gemini
Operating LLMs in production
Open-source tool designed to enhance the efficiency of workloads
FlashInfer: Kernel Library for LLM Serving
Trainable models and NN optimization tools
Probabilistic reasoning and statistical analysis in TensorFlow
Data manipulation and transformation for audio signal processing
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Optimizing inference proxy for LLMs
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
A Unified Library for Parameter-Efficient Learning
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
A lightweight vision library for performing large object detection
State-of-the-art diffusion models for image and audio generation
PyTorch extensions for fast R&D prototyping and Kaggle farming
The Triton Inference Server provides an optimized cloud
A high-performance ML model serving framework that offers dynamic batching
Framework that is dedicated to making neural data processing
Official inference library for Mistral models
PyTorch library of curated Transformer models and their components