Visual Instruction Tuning: Large Language-and-Vision Assistant
Libraries for applying sparsification recipes to neural networks
Optimizing inference proxy for LLMs
The unofficial Python package that returns responses from Google Bard
Open platform for training, serving, and evaluating language models
OpenAI-style API for open large language models
Neural Network Compression Framework for enhanced OpenVINO
Efficient few-shot learning with Sentence Transformers
A Unified Library for Parameter-Efficient Learning
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Framework that is dedicated to making neural data processing
Probabilistic reasoning and statistical analysis in TensorFlow
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Build your chatbot within minutes on your favorite device
Low-latency REST API for serving text-embeddings
Powering Amazon custom machine learning chips
Implementation of "Tree of Thoughts"
Implementation of model parallel autoregressive transformers on GPUs
A computer vision framework to create and deploy apps in minutes
CPU/GPU inference server for Hugging Face transformer models