Large Language Model Text Generation Inference
OpenVINO™ Toolkit repository
Sparsity-aware deep learning inference runtime for CPUs
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
LLM.swift is a simple and readable library
OpenAI-style API for open large language models
Neural Network Compression Framework for enhanced OpenVINO
Efficient few-shot learning with Sentence Transformers
Libraries for applying sparsification recipes to neural networks
Bolt is a deep learning library with high performance
Bring the notion of Model-as-a-Service to life
A Unified Library for Parameter-Efficient Learning
The free, Open Source alternative to OpenAI, Claude and others
Build Production-ready Agentic Workflow with Natural Language
Run local LLMs like llama, deepseek, kokoro etc. inside your browser
An easy-to-use LLM quantization package with user-friendly APIs
A real-time inference engine for temporal logical specifications
Framework that is dedicated to making neural data processing
Self-contained Machine Learning and Natural Language Processing lib
Database system for building simpler and faster AI-powered applications
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Fast and user-friendly runtime for transformer inference