An easy-to-use LLM quantization package with user-friendly APIs
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
Operating LLMs in production
Lightweight Python library for real-time multi-object tracking
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
LLMFlows - Simple, Explicit and Transparent LLM Apps
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Phi-3.5 for Mac: Locally-run Vision and Language Models
Efficient few-shot learning with Sentence Transformers
A Unified Library for Parameter-Efficient Learning
Official inference library for Mistral models
Images to inference with no labeling
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods
GPU environment management and cluster orchestration
PyTorch library of curated Transformer models and their components
Toolbox of models, callbacks, and datasets for AI/ML researchers
Framework dedicated to neural data processing
MII makes low-latency and high-throughput inference possible
Trainable models and NN optimization tools
PyTorch extensions for fast R&D prototyping and Kaggle farming
Probabilistic reasoning and statistical analysis in TensorFlow
Low-latency REST API for serving text embeddings
Multilingual Automatic Speech Recognition with word-level timestamps
Implementation of "Tree of Thoughts"
Run 100B+ language models at home, BitTorrent-style