INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Large Language Model Text Generation Inference
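If this is Hugging Face's text-generation-inference (TGI) server, a minimal sketch of querying its REST /generate endpoint might look like this; the host, port, prompt, and parameters are illustrative assumptions:

```python
import requests

# Assumes a TGI server is already running and reachable at this address;
# the request shape follows TGI's /generate REST API.
resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "What is deep learning?",
        "parameters": {"max_new_tokens": 64, "temperature": 0.7},
    },
    timeout=60,
)
print(resp.json()["generated_text"])
```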
DoWhy is a Python library for causal inference
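A minimal DoWhy sketch of its model/identify/estimate workflow; the toy data, column names, and estimator choice are illustrative:

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Toy data: a confounder w drives both the treatment t and the outcome y,
# and the true causal effect of t on y is 2.
n = 1000
w = np.random.normal(size=n)
t = (w + np.random.normal(size=n) > 0).astype(int)
y = 2 * t + w + np.random.normal(size=n)
df = pd.DataFrame({"w": w, "t": t, "y": y})

model = CausalModel(data=df, treatment="t", outcome="y", common_causes=["w"])
estimand = model.identify_effect()
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print(estimate.value)  # should land near the true effect of 2
```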
Data manipulation and transformation for audio signal processing, powered by PyTorch
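This matches torchaudio's tagline; a small sketch of a typical load/resample/feature pipeline, with a hypothetical file path:

```python
import torchaudio

# Load a waveform and resample it to 16 kHz; "speech.wav" is a placeholder.
waveform, sample_rate = torchaudio.load("speech.wav")
resample = torchaudio.transforms.Resample(orig_freq=sample_rate, new_freq=16000)
waveform_16k = resample(waveform)

# Turn the resampled audio into a Mel spectrogram for downstream models.
mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000)(waveform_16k)
print(mel.shape)
```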
Efficient few-shot learning with Sentence Transformers
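This reads like the SetFit project; a sketch using its classic SetFitTrainer API (newer releases expose setfit.Trainer instead), with made-up labels and an example checkpoint:

```python
from datasets import Dataset
from setfit import SetFitModel, SetFitTrainer

# Eight labeled sentences stand in for a real few-shot dataset.
train_ds = Dataset.from_dict({
    "text": ["great movie", "loved it", "awful film", "terrible plot",
             "wonderful acting", "fantastic score", "boring mess", "waste of time"],
    "label": [1, 1, 0, 0, 1, 1, 0, 0],
})

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-MiniLM-L3-v2")
trainer = SetFitTrainer(model=model, train_dataset=train_ds)
trainer.train()
print(model(["a truly enjoyable watch"]))  # predicted label(s)
```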
Uplift modeling and causal inference with machine learning algorithms
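If this is CausalML, a minimal meta-learner sketch on its bundled synthetic data; the choice of XGBTRegressor is illustrative:

```python
from causalml.dataset import synthetic_data
from causalml.inference.meta import XGBTRegressor

# Synthetic uplift data: outcome y, features X, binary treatment indicator.
y, X, treatment, tau, b, e = synthetic_data(mode=1, n=1000, p=5, sigma=1.0)

learner = XGBTRegressor()  # T-learner with XGBoost base models
ate, lb, ub = learner.estimate_ate(X=X, treatment=treatment, y=y)
print(ate, lb, ub)  # average treatment effect with confidence bounds
```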
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
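This matches EconML; a short LinearDML sketch on synthetic data where the treatment effect varies with a feature X:

```python
import numpy as np
from econml.dml import LinearDML

# Toy data: the effect of treatment T on outcome Y grows with X.
n = 2000
X = np.random.uniform(-1, 1, size=(n, 1))
W = np.random.normal(size=(n, 3))            # confounders
T = np.random.binomial(1, 0.5, size=n)
Y = (1 + 2 * X[:, 0]) * T + W[:, 0] + np.random.normal(size=n)

est = LinearDML(discrete_treatment=True)
est.fit(Y, T, X=X, W=W)
print(est.effect(X[:5]))  # heterogeneous (CATE) estimates for five units
```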
A library to communicate with ChatGPT, Claude, Copilot, and Gemini
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines
State-of-the-art diffusion models for image and audio generation
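Assuming this refers to Hugging Face diffusers, a minimal text-to-image sketch; the checkpoint id and prompt are placeholders:

```python
import torch
from diffusers import DiffusionPipeline

# Any diffusers-compatible checkpoint works here; this one is illustrative.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
image.save("lighthouse.png")
```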
A Unified Library for Parameter-Efficient Learning
Trainable models and NN optimization tools
Probabilistic reasoning and statistical analysis in TensorFlow
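A small TensorFlow Probability sketch showing the distributions API the line refers to; the parameters are arbitrary:

```python
import tensorflow_probability as tfp

tfd = tfp.distributions

# Sample from and score a simple distribution.
dist = tfd.Normal(loc=0.0, scale=1.0)
samples = dist.sample(5)
print(dist.log_prob(samples))

# Distributions compose, e.g. a two-component Gaussian mixture.
mix = tfd.MixtureSameFamily(
    mixture_distribution=tfd.Categorical(probs=[0.3, 0.7]),
    components_distribution=tfd.Normal(loc=[-1.0, 1.0], scale=[0.5, 0.5]),
)
print(mix.mean())
```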
The Triton Inference Server provides an optimized cloud and edge inferencing solution
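A client-side sketch against a running Triton server using the tritonclient package; the model name, tensor names, and shapes are hypothetical and must match the deployed model's config.pbtxt:

```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build one FP32 input tensor and fill it from a NumPy array.
inp = httpclient.InferInput("input__0", [1, 3, 224, 224], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 3, 224, 224).astype(np.float32))

result = client.infer(model_name="resnet50", inputs=[inp])
print(result.as_numpy("output__0").shape)
```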
Uncover insights, surface problems, monitor, and fine-tune your LLM
OpenMMLab Model Deployment Framework
Multilingual Automatic Speech Recognition with word-level timestamps
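This looks like whisper-timestamped; a sketch following its README pattern, with a placeholder audio path and model size:

```python
import whisper_timestamped as whisper

audio = whisper.load_audio("speech.wav")   # placeholder path
model = whisper.load_model("tiny")
result = whisper.transcribe(model, audio)

# Each segment carries per-word timing information.
for segment in result["segments"]:
    for word in segment["words"]:
        print(word["text"], word["start"], word["end"])
```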
Optimizing inference proxy for LLMs
LMDeploy is a toolkit for compressing, deploying, and serving LLMs
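A minimal LMDeploy pipeline sketch; the model id is illustrative and any supported chat model can be substituted:

```python
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")  # illustrative model id
responses = pipe(["Hi, please introduce yourself", "Shanghai is"])
print(responses)
```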
A lightweight vision library for performing large-scale object detection & instance segmentation
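This reads like SAHI's sliced-inference workflow; a sketch with placeholder weights, image path, and slice sizes:

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Weights path and detector type are placeholders.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov8", model_path="yolov8n.pt",
    confidence_threshold=0.4, device="cpu",
)

# Run detection over 512x512 slices of a large image, then merge results.
result = get_sliced_prediction(
    "large_image.jpg", detection_model,
    slice_height=512, slice_width=512,
    overlap_height_ratio=0.2, overlap_width_ratio=0.2,
)
print(len(result.object_prediction_list))
```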
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
Easy-to-use deep learning framework with 3 key features
PyTorch extensions for fast R&D prototyping and Kaggle farming
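If this is pytorch-toolbelt, one of its signature conveniences is composing segmentation losses; a sketch with arbitrary weights:

```python
import torch
from pytorch_toolbelt import losses as L

# Weighted sum of two binary segmentation losses; 0.7/0.3 are arbitrary.
loss_fn = L.JointLoss(L.BinaryFocalLoss(), L.BinaryLovaszLoss(), 0.7, 0.3)

logits = torch.randn(4, 1, 64, 64)
target = torch.randint(0, 2, (4, 1, 64, 64)).float()
print(loss_fn(logits, target))
```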
Framework dedicated to making neural data processing pipelines simple and fast
Unified Model Serving Framework
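This matches BentoML's tagline; a minimal service sketch in the BentoML 1.2+ style, with made-up class and endpoint names:

```python
import bentoml

@bentoml.service
class Echo:
    # One HTTP endpoint; input/output types are inferred from annotations.
    @bentoml.api
    def echo(self, text: str) -> str:
        return text

# Serve locally with: bentoml serve <module>:Echo
```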