OpenAI-style API for open large language models
Run local LLMs on any device; open source
A high-throughput and memory-efficient inference and serving engine for LLMs
Ready-to-use OCR with 80+ supported languages
The official Python client for the Hugging Face Hub
Everything you need to build state-of-the-art foundation models
GPU environment management and cluster orchestration
State-of-the-art diffusion models for image and audio generation
Uncover insights, surface problems, monitor, and fine-tune your LLM
FlashInfer: Kernel Library for LLM Serving
Deep learning optimization library: makes distributed training easy
Standardized Serverless ML Inference Platform on Kubernetes
Library for OCR-related tasks powered by Deep Learning
20+ high-performance LLMs with recipes to pretrain and finetune at scale
The Triton Inference Server provides an optimized cloud and edge inferencing solution
Official inference library for Mistral models
Tensor search for humans
Bring the notion of Model-as-a-Service to life
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Easy-to-use speech toolkit including self-supervised learning models
Data manipulation and transformation for audio signal processing
A Pythonic framework to simplify AI service building
Open-source tool designed to improve the efficiency of GPU workloads
State-of-the-art Parameter-Efficient Fine-Tuning
An MLOps framework to package, deploy, monitor and manage models