Sparsity-aware deep learning inference runtime for CPUs
Large Language Model Text Generation Inference
A set of Docker images for training and serving models in TensorFlow
Easiest and laziest way to build multi-agent LLM applications
Open-source tool designed to enhance the efficiency of workloads
Easy-to-use Speech Toolkit including Self-Supervised Learning models
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
A library for accelerating Transformer models on NVIDIA GPUs
Trainable models and NN optimization tools
Visual Instruction Tuning: Large Language-and-Vision Assistant
Data manipulation and transformation for audio signal processing
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
A lightweight vision library for performing large-scale object detection
LLM training code for MosaicML foundation models
DoWhy is a Python library for causal inference
PyTorch library of curated Transformer models and their components
State-of-the-art Parameter-Efficient Fine-Tuning
Optimizing inference proxy for LLMs
Neural Network Compression Framework for enhanced OpenVINO inference
Build your chatbot within minutes on your favorite device
Multilingual Automatic Speech Recognition with word-level timestamps
Library for OCR-related tasks powered by Deep Learning
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
GPU environment management and cluster orchestration
Probabilistic reasoning and statistical analysis in TensorFlow