Neural Network Compression Framework for enhanced OpenVINO inference
Efficient few-shot learning with Sentence Transformers
A Unified Library for Parameter-Efficient Learning
PyTorch library of curated Transformer models and their components
Unofficial Python package that returns responses from Google Bard
Open platform for training, serving, and evaluating language models
Probabilistic reasoning and statistical analysis in TensorFlow
Build your chatbot within minutes on your favorite device
Easiest and laziest way to build multi-agent LLM applications
Low-latency REST API for serving text embeddings
Tensor search for humans
Powering Amazon custom machine learning chips
LLMFlows - Simple, Explicit and Transparent LLM Apps
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Run 100B+ language models at home, BitTorrent-style
Implementation of "Tree of Thoughts"
A computer vision framework to create and deploy apps in minutes
Implementation of model parallel autoregressive transformers on GPUs
The deep learning toolkit for speech-to-text
CPU/GPU inference server for Hugging Face transformer models