Run local LLMs on any device; open-source
Efficient few-shot learning with Sentence Transformers
Official inference library for Mistral models
A library for accelerating Transformer models on NVIDIA GPUs
The Triton Inference Server provides an optimized cloud and edge inferencing solution
Easy-to-use speech toolkit including self-supervised learning models
An unofficial Python package that returns responses from Google Bard
Run any Llama 2 model locally with a Gradio UI, on GPU or CPU, from anywhere
Lightweight anchor-free object detection model
Training and implementation of chatbots leveraging GPT-like architectures