A set of Docker images for training and serving models in TensorFlow
Fast inference engine for Transformer models
AIMET is a library that provides advanced quantization and compression
Superduper: Integrate AI models and machine learning workflows
Deep learning optimization library: makes distributed training easy
Trainable models and NN optimization tools
An unofficial Python package that returns responses from Google Bard
DoWhy is a Python library for causal inference
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Implementation of "Tree of Thoughts"
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
MII makes low-latency and high-throughput inference possible
Operating LLMs in production
Official inference library for Mistral models
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
Sequence-to-sequence framework, focused on Neural Machine Translation
Guide to deploying deep-learning inference networks
Toolkit for inference and serving with MXNet in SageMaker