Multilingual Automatic Speech Recognition with word-level timestamps
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
State-of-the-art diffusion models for image and audio generation
Create HTML profiling reports from pandas DataFrame objects
Open-source tool designed to enhance the efficiency of workloads
An MLOps framework to package, deploy, monitor and manage models
Deep learning optimization library: makes distributed training easy
Simplifies the local serving of AI models from any source
GPU environment management and cluster orchestration
LLM training code for MosaicML foundation models
FlashInfer: Kernel Library for LLM Serving
Optimizing inference proxy for LLMs
Neural Network Compression Framework for enhanced OpenVINO inference
Build your chatbot within minutes on your favorite device
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Probabilistic reasoning and statistical analysis in TensorFlow
Libraries for applying sparsification recipes to neural networks
Sparsity-aware deep learning inference runtime for CPUs
Large Language Model Text Generation Inference
Easiest and laziest way to build multi-agent LLM applications
Efficient few-shot learning with Sentence Transformers
Superduper: Integrate AI models and machine learning workflows
MII makes low-latency and high-throughput inference possible
Official inference library for Mistral models
20+ high-performance LLMs with recipes to pretrain and finetune at scale