Run local LLMs on any device; open-source
Port of Facebook's LLaMA model in C/C++
An MLOps framework to package, deploy, monitor, and manage models
Create HTML profiling reports from pandas DataFrame objects
Optimizing inference proxy for LLMs
OpenMMLab Model Deployment Framework
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
Fast inference engine for Transformer models
Trainable models and neural-network optimization tools
A set of Docker images for training and serving models in TensorFlow
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
OpenMMLab Video Perception Toolbox