Single-cell analysis in Python
A scalable inference server for models optimized with OpenVINO
The easiest and laziest way to build multi-agent LLM applications
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Port of OpenAI's Whisper model in C/C++
A high-performance inference system for large language models
High-performance inference and deployment toolkit for LLMs and VLMs
Low-latency REST API for serving text embeddings
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Bolt is a high-performance deep learning library
Uplift modeling and causal inference with machine learning algorithms
Serving system for machine learning models
Everything you need to build state-of-the-art foundation models
An Easy-to-Use and High-Performance AI Deployment Framework
High-Resolution Image Synthesis with Latent Diffusion Models
OpenMLDB is an open-source machine learning database
Trainable, memory-efficient, and GPU-friendly PyTorch reproduction
Official inference framework for 1-bit LLMs
Easy-to-use deep learning framework with 3 key features
Unified Model Serving Framework
A library for accelerating Transformer models on NVIDIA GPUs
Jlama is a modern LLM inference engine for Java
The official Python client for the Hugging Face Hub
A Customizable Image-to-Video Model based on HunyuanVideo
AirLLM: 70B model inference on a single 4GB GPU