Run local LLMs like Llama, DeepSeek, Kokoro, etc. inside your browser
PyTorch library of curated Transformer models and their components
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Data manipulation and transformation for audio signal processing
Simplifies the local serving of AI models from any source
Multi-lingual large voice generation model, providing inference
Qwen3 is the large language model series developed by Qwen team
Multilingual Automatic Speech Recognition with word-level timestamps
MNN is a blazing fast, lightweight deep learning framework
LightLLM is a Python-based LLM (Large Language Model) inference
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
Run Local LLMs on Any Device. Open-source
Mooncake is the serving platform for Kimi
Official Python inference and LoRA trainer package
Towards Human-Sounding Speech
Pruna is a model optimization framework built for developers
Any model. Any hardware. Zero compromise
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
High-speed Large Language Model Serving for Local Deployment
Libraries for applying sparsification recipes to neural networks
PyTorch domain library for recommendation systems
Open-Source AI Camera. Empower any camera/CCTV
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
A high-performance ML model serving framework offering dynamic batching