Run Local LLMs on Any Device. Open-source
FlashInfer: Kernel Library for LLM Serving
Phi-3.5 for Mac: Locally-run Vision and Language Models
Simplifies the local serving of AI models from any source
Neural Network Compression Framework for enhanced OpenVINO
A toolkit to optimize Keras & TensorFlow ML models for deployment
State-of-the-art Parameter-Efficient Fine-Tuning
Replace OpenAI GPT with another LLM in your app
AIMET is a library that provides advanced quantization and compression
Probabilistic reasoning and statistical analysis in TensorFlow
An easy-to-use LLM quantization package with user-friendly APIs
Images to inference with no labeling
Database system for building simpler and faster AI-powered applications