A Unified Library for Parameter-Efficient Learning
Sparsity-aware deep learning inference runtime for CPUs
Neural Network Compression Framework for enhanced OpenVINO
Bring the notion of Model-as-a-Service to life
Libraries for applying sparsification recipes to neural networks
OpenAI-style API for open large language models
Large Language Model Text Generation Inference
Efficient few-shot learning with Sentence Transformers
Uncover insights, surface problems, monitor, and fine-tune your LLM
An easy-to-use LLM quantization package with user-friendly APIs