Run local models such as Llama, DeepSeek, and Kokoro inside your browser
User-friendly AI Interface
A high-performance ML model serving framework that offers dynamic batching
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Build Production-ready Agentic Workflows with Natural Language
Visual Instruction Tuning: Large Language-and-Vision Assistant
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere