C#/.NET binding of llama.cpp, including LLaMa/GPT model inference
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
A set of Docker images for training and serving models in TensorFlow
Operating LLMs in production
Official inference library for Mistral models
DoWhy is a Python library for causal inference
Trainable models and NN optimization tools
Superduper: Integrate AI models and machine learning workflows
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Swift async text-to-image generation for SwiftUI apps using the OpenAI API
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
MII makes low-latency and high-throughput inference possible
Fast inference engine for Transformer models
Deep learning optimization library: makes distributed training easy
An unofficial Python package that returns responses from Google Bard
Lightweight inference library for ONNX files, written in C++
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Implementation of "Tree of Thoughts"
Sequence-to-sequence framework, focused on Neural Machine Translation
Guide to deploying deep-learning inference networks
Toolkit for allowing inference and serving with MXNet in SageMaker