20+ high-performance LLMs with recipes to pretrain and finetune at scale
Replace OpenAI GPT with another LLM in your app
Visual Instruction Tuning: Large Language-and-Vision Assistant
Optimizing inference proxy for LLMs
Neural Network Compression Framework for enhanced OpenVINO inference
Framework that is dedicated to making neural data processing
OpenAI-style API for open large language models
Efficient few-shot learning with Sentence Transformers
A Unified Library for Parameter-Efficient Learning
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Probabilistic reasoning and statistical analysis in TensorFlow
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Low-latency REST API for serving text-embeddings
Tensor search for humans
Implementation of "Tree of Thoughts"
Implementation of model parallel autoregressive transformers on GPUs
A computer vision framework to create and deploy apps in minutes
CPU/GPU inference server for Hugging Face transformer models